The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Updated Jun 11, 2024 - Python
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Explore an extensive repository of publicly accessible personal notebooks on machine learning engineering, data science analytics, artificial intelligence, and more.
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
🤖 Build AI applications with confidence ✅ DSPy Visualizer ✅ Understand how your users are using your LLM-app ✅ Get a full picture of the quality performance of your LLM-app ✅ Collaborate with your stakeholders in ONE platform ✅ Iterate towards the most valuable & reliable LLM-app.
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!
Generation and evaluation of synthetic time series datasets (plus augmentations, visualizations, and a collection of popular datasets)
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
CSGHub is an open-source, trusted platform for managing large-model assets. It helps users govern the assets involved in the lifecycle of LLMs and LLM applications (datasets, model files, code, and more). CSGHub works like an on-premise Hugging Face, managing LLM assets much as OpenStack Glance manages virtual machine images, Harbor manages container images, and Sonatype Nexus manages artifacts. Follows, feedback, and Stars ⭐️ are welcome.
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
Video Games Data is a project that provides video-game-related data for data enthusiasts, data scientists, and machine learning practitioners to explore and analyze.
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.