Sunday, January 7, 2024

 https://github.com/jasonthemonster/implementation-of-parallel-string-matching-algorithms-with-cuda/tree/master
Implement parallel string matching algorithms with CUDA in C - GitHub - JasontheMonster/Implementation ... Binary tree witness array elimination
parallel-algorithm
Parallelized histogram equalization for grayscale images using GPUs in CUDA C++ in a consumer-producer approach (streams, RDMA, InfiniBand).
https://github.com/kaletap/bfs-cuda-gpu
https://github.com/kaletap/bfs-cuda-gpu/tree/a63a441087c09507389149b99638dce7e2d06d7e
https://github.com/topics/parallel-computing?l=cuda

Optimizing global memory load in CUDA cuda parallel

__global__ absolute threadidx blockidx blockdim memory LoadFromStream

Using shared memory in Dynamic Parallelism CUDA - Stack Overflow
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#configuration-options

nvidia developer kit MAPping cuda ai parallel calculation  Tensor Cores

https://developer.nvidia.com/blog/cuda-10-features-revealed/

ACM Transactions on Programming Languages and Systems: Vol 45, No 3

A Model Checker for Operator Precedence Languages
Software notations and tools
    Compilers
    Context specific languages
        Domain specific languages
    Formal language definitions
        Syntax
    General programming languages
        Language types
            Distributed programming languages
Semantics and reasoning
    Program reasoning
        Invariants
        Pre- and post-conditions
        Program verification
    Program semantics
        Categorical semantics

A Practitioner's Guide to Natural Language Processing
UBG Repository
http://repository.universitasbumigora.ac.id › ...
(grammar) c = rc.parse(tagged_simple_sent). Figure 3-9. Shallow parsing using ... Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 13.0min finished. # evaluate ...

https://www.researchgate.net/publication/285648890_Sparse_Tensor_Algebra_as_a_Parallel_Programming_Model
Sparse Tensor Algebra as a Parallel Programming Model

https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-tensor-parallelism-how-it-works.html
Parallel parsing made practical - ScienceDirect
https://www.sciencedirect.com/science/article/pii/S0167642315002610
Science of Computer Programming Volume 112, Part 3, 15 November 2015, Pages 195-226

https://www.sciencedirect.com/science/article/pii/S0167642315002610#bl0010
References [1] A.V. Aho, M.S. Lam, R. Sethi, J.D. Ullman, Compilers: Principles, Techniques, and Tools (second edition), Pearson/Addison Wesley (2007)


load text document to tensor
     tensor expression tree feedforward mlp operator
     intrinsic. compiler-assisted operator template library for dnn accelerators

International Journal of Parallel Programming

https://www.researchgate.net/figure/tensor-expression-tree-of-the-feedforward-mlp-operator-and-the-low-level-instrinsic_fig3_350389324
Tensor expression tree of the feedforward MLP operator and the low-level intrinsic functions.

https://github.com/cmhungsteve/Awesome-Transformer-Attention

Tensor expression tree of the feedforward MLP operator compiler-assisted operator template library for dnn accelerators


Compiler-assisted Operator Template Library for DNN Accelerators. October. International Journal of Parallel Programming

An Application-oblivious Memory Scheduling System for DNN Accelerators | ACM Transactions on Architecture and Code Optimization
https://dl.acm.org/doi/abs/10.1145/3535355

https://nlp.cs.nyu.edu/wikipedia-data/
Tagged and Cleaned Wikipedia (TC Wikipedia) and its Ngram
Wikipedia is a relatively big and consistent resource for NLP researchers to work
Natural Language Processing. Wikipedia
https://www.researchgate.net/figure/Computer-science-Wikipedia-link-with-its-relative-Babel-synset_fig3_282247951
https://www.researchgate.net/publication/282247951_Automatic_Identification_and_Disambiguation_of_Concepts_and_Named_Entities_in_the_Multilingual_Wikipedia
Automatic Identification and Disambiguation of Concepts and Named Entities in the Multilingual Wikipedia

https://www.cambricon.com/docs/bangc/developer_guide_html/
Cambricon BANG C Developer Guide 2.15.0 documentation
Machine learning tasks are becoming pervasive in a broad range
CNCC (Cambricon Neuware Compiler Collection), CNCC Compiler Architecture

High-traffic website architecture website traffic website traffic tracking unveiling the secrets behind website traffic ranking and roi

 https://aicontentfy.com/en/blog/unveiling-secrets-behind-website-traffic-ranking-and-roi
https://fastercapital.com/content/Unveiling-secrets-behind-website-traffic-ranking-and-roi.html


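Architecture design for large-scale distributed systems handling traffic in the tens of millions - 閱坊
Solutions for Ultra-High-Traffic Systems, summary notes 1. CH.5 Spread Far and Wide: database and table sharding schemes (unfinished) | by ZONGRU Li | Medium
Large-scale website architecture, reading notes: the evolution of large-scale website architecture. Reading notes on large-scale websites, the evolution of website architecture | by Charlie Lee | Bucketing | Medium
博客來 (Books.com.tw) - Solutions for Ultra-High-Traffic Systems: Experience Shared by a Large-Scale Website Architect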

Super large traffic Large traffic website architecture
7 ways to build scalable platforms that serve high traffic | ConnectWise
Overview of the traffic measurement and analysis architecture with Hadoop.

Scalable Web Architecture: How We Handled surge in Traffic - Core dna
CDN SDN Diagram Scalable Web Content Delivery Network
SDN/NFV-assisted CDN virtualization.

Super large traffic
Large traffic website architecture
Dynamic traffic
scalable platforms serve high traffic ConnectWise
traffic architecture Hadoop Diagram

https://www.geeksforgeeks.org/system-design-netflix-a-complete-architecture/