Model Type | falcon, causal decoder-only |
|
Use Cases |
Areas: | |
Primary Use Cases: | A ready-to-use chat model based on Falcon-180B. |
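As a sketch of how chat-style prompting for such a model can be assembled, the helper below builds a turn-based prompt string. The `System:`/`User:`/`Falcon:` labels and the `build_chat_prompt` helper are assumptions for illustration, not the model's confirmed template; consult the official model card for the exact format.

```python
def build_chat_prompt(turns, system=None):
    """Assemble a simple turn-based chat prompt.

    NOTE: the 'System:'/'User:'/'Falcon:' labels are a hypothetical
    template for illustration; check the official model card for the
    exact prompt format the model was fine-tuned on.

    turns: list of (user_message, assistant_message_or_None) pairs.
    """
    parts = []
    if system:
        parts.append(f"System: {system}")
    for user_msg, assistant_msg in turns:
        parts.append(f"User: {user_msg}")
        if assistant_msg is not None:
            parts.append(f"Falcon: {assistant_msg}")
    parts.append("Falcon:")  # trailing label cues the model to respond
    return "\n".join(parts)
```

The trailing `Falcon:` label leaves the prompt open-ended so the model continues with the assistant turn.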
|
Limitations: | Trained mostly on English data; may not generalize well to other languages. Out of scope for production use without an adequate risk assessment. |
|
Considerations: | Develop appropriate precautions for any production use. |
|
|
Additional Notes | The model uses multiquery attention and FlashAttention optimizations. |
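To illustrate the multiquery idea mentioned above, here is a minimal NumPy sketch: each head keeps its own query projection, but all heads share a single key head and a single value head, which shrinks the KV cache during inference. Shapes and the loop-based form are illustrative assumptions, not Falcon's actual implementation.

```python
import numpy as np

def multiquery_attention(x, Wq, Wk, Wv, n_heads):
    """Minimal multiquery attention sketch (illustrative, not Falcon's
    actual code): per-head queries, but one shared key/value head."""
    seq, d_model = x.shape
    d_head = d_model // n_heads

    q = x @ Wq                     # (seq, d_model): per-head queries
    k = x @ Wk                     # (seq, d_head): single shared key head
    v = x @ Wv                     # (seq, d_head): single shared value head

    q = q.reshape(seq, n_heads, d_head)
    # causal mask: position i may only attend to positions <= i
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)

    out = np.empty_like(q)
    for h in range(n_heads):
        scores = (q[:, h, :] @ k.T) / np.sqrt(d_head)   # (seq, seq)
        scores = np.where(mask, -np.inf, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h, :] = weights @ v                      # shared V for all heads
    return out.reshape(seq, d_model)
```

Because `k` and `v` are computed once and reused by every head, only one key/value pair per position must be cached at generation time, instead of one per head.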
|
Supported Languages | English (high proficiency), German (high proficiency), Spanish (high proficiency), French (high proficiency), Italian (limited proficiency), Portuguese (limited proficiency), Polish (limited proficiency), Dutch (limited proficiency), Romanian (limited proficiency), Czech (limited proficiency), Swedish (limited proficiency) |
|
Training Details |
Data Sources: | Ultrachat, Platypus, Airoboros |
|
Context Length: | |
Hardware Used: | AWS SageMaker on up to 4,096 A100 40GB GPUs in P4d instances |
|
Model Architecture: | Causal decoder-only model with rotary positional embeddings, multiquery attention, and a parallel attention/MLP decoder block with two layer norms. |
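The rotary positional embeddings named above can be sketched as follows: each pair of feature dimensions is rotated by a position-dependent angle, so relative position is encoded directly in the query/key dot product. The split-half pairing, shapes, and `base` value are illustrative assumptions, not Falcon's exact configuration.

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Minimal rotary positional embedding (RoPE) sketch. Pairing and
    base frequency are illustrative, not Falcon's exact configuration.

    x: (seq, dim) array with dim even.
    """
    seq, dim = x.shape
    half = dim // 2
    # one rotation frequency per feature pair, decaying geometrically
    freqs = base ** (-np.arange(half) / half)     # (half,)
    angles = np.outer(np.arange(seq), freqs)      # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # standard 2-D rotation applied to each (x1, x2) pair per position
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Since each pair is rotated rather than scaled, vector norms are preserved and position 0 is left unchanged.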
|
|
Safety Evaluation |
Risk Categories: | Bias due to web content stereotypes |
|
Ethical Considerations: | The model may carry the stereotypes and biases commonly encountered online. |
|
|
Responsible AI Considerations |
Fairness: | Develop guardrails for any production use to mitigate stereotypes and biases. |
|
|
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | |
Performance Tips: | Use the latest versions of the recommended software and follow the published guidance for optimal hardware usage. |
|
|