kernels-community/flash-attn3

sayakpaul (HF Staff) committed 15 days ago

Commit d45f0de (verified) · 1 parent: 5848aff

Upload README.md with huggingface_hub

Files changed (1)

README.md CHANGED

@@ -1,15 +1,55 @@
 ---
-license: bsd-3-clause
-tags:
-  - kernels
 ---
-# Flash Attention 3
-Flash Attention is a fast and memory-efficient implementation of the
-attention mechanism, designed to work with large models and long sequences.
-This is a Hugging Face compliant kernel build of Flash Attention.
-Original code here [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention).
-Kernel source: https://github.com/huggingface/kernels-community/tree/main/flash-attn3

 ---
+library_name: kernels
+license: apache-2.0
 ---
+<!-- This model card has automatically been generated. You
+should probably proofread and complete it, then remove this comment. -->
+This is the repository card of {repo_id}, which has been pushed to the Hub. It was built to be used with the [`kernels` library](https://github.com/huggingface/kernels). This card was automatically generated.
+## How to use
+```python
+# make sure `kernels` is installed: `pip install -U kernels`
+from kernels import get_kernel
+kernel_module = get_kernel("kernels-community/flash-attn3") # <- change the ID if needed
+flash_attn_combine = kernel_module.flash_attn_combine
+flash_attn_combine(...)
+```
+## Available functions
+- `flash_attn_combine`
+- `flash_attn_func`
+- `flash_attn_qkvpacked_func`
+- `flash_attn_varlen_func`
+- `flash_attn_with_kvcache`
+- `get_scheduler_metadata`
+## Supported backends
+- cuda
+## CUDA Capabilities
+- 8.0
+- 9.0a
+## Benchmarks
+A benchmarking script is available for this kernel. Run `kernels benchmark org-id/repo-id`, replacing `org-id` and `repo-id` with the actual values.
+[TODO: provide benchmarks if available]
+## Source code
+[TODO: provide original source code and other relevant citations if available]
+## Notes
+[TODO: provide additional notes about this kernel if needed]
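
The new card lists `cuda` as the only backend and `8.0` and `9.0a` as the CUDA capabilities. A minimal sketch of guarding the `get_kernel` call on that list follows. It is not part of the commit. The helper names (`parse_capability`, `is_listed_capability`, `load_flash_attn3_or_none`) are hypothetical, and treating the list as an exact-match allowlist is an assumption that may be stricter than what the builds actually support.

```python
# Capability strings copied from the "CUDA Capabilities" list in the card.
LISTED_CAPABILITIES = ("8.0", "9.0a")


def parse_capability(cap: str) -> tuple[int, int]:
    """Convert a string such as "9.0a" to (9, 0), dropping any letter suffix."""
    major, minor = cap.rstrip("abcdefghijklmnopqrstuvwxyz").split(".")
    return int(major), int(minor)


def is_listed_capability(major: int, minor: int) -> bool:
    """Assumption: only exact matches against the card's list count as supported."""
    return (major, minor) in {parse_capability(c) for c in LISTED_CAPABILITIES}


def load_flash_attn3_or_none(repo_id: str = "kernels-community/flash-attn3"):
    """Hypothetical helper: return the kernel module, or None if the GPU is not listed."""
    # Imported lazily so the pure-Python helpers above work without torch or kernels.
    import torch
    from kernels import get_kernel

    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    if not is_listed_capability(major, minor):
        return None
    return get_kernel(repo_id)
```

A caller could fall back to another attention implementation when `load_flash_attn3_or_none()` returns `None`.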