Issue #26 - Friday April 21, 2023
What’s New @ NeuML publishes interesting content covering our open source projects, services and insights.
This week txtai 5.5 was released. Here is the summary.
v5.5.0
This release adds workflow streams and DuckDB as a database backend.
↪️ Workflow streams enable server-side processing of large datasets. Streams iteratively pass content to workflows, so there is no need to pass bulk data through the API.
🦆 DuckDB is a new database backend. Certain larger non-vector-driven queries and aggregations will now run significantly faster than with SQLite.
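To illustrate the idea behind workflow streams, here is a minimal, library-free sketch in plain Python. It is not the txtai API: the names `stream`, `workflow` and `run` are hypothetical stand-ins, and the upper-casing step only marks where a real workflow task would go. It shows content being read in small batches and handed to a workflow one batch at a time, rather than the whole dataset being sent in a single call.

```python
from itertools import islice


def stream(source, batch=2):
    """Yield successive batches from any iterable without loading it all at once."""
    iterator = iter(source)
    while chunk := list(islice(iterator, batch)):
        yield chunk


def workflow(elements):
    """Hypothetical stand-in for a workflow: a single task that upper-cases text."""
    return [text.upper() for text in elements]


def run(source, batch=2):
    """Feed the workflow one batch at a time and yield results as they are ready."""
    for chunk in stream(source, batch):
        yield from workflow(chunk)


# A generator stands in for a large server-side dataset (assumed for illustration)
rows = (f"document {i}" for i in range(5))
results = list(run(rows))
```

Because `stream` is a generator, only one batch is held in memory at a time, which is the property that makes this pattern suitable for large datasets.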
We’re excited to announce the initial release of txtinstruct!
txtinstruct is a framework for training instruction-tuned models. The objective of this project is to support open data, open models and integration with your own data. One of the biggest problems today is the lack of licensing clarity with instruction-following datasets and large language models.
txtinstruct makes it easy to build your own instruction-following datasets and use those datasets to train instruction-tuned models.
⚙️ Models
Wikipedia Semantic Search
Embeddings database for semantic search of Wikipedia
❤️ Community
There are a growing number of videos on txtai published by the community. Here is a short list. There are also videos available on NeuML’s YouTube channel.
How to Create an AI-Assisted Search Engine with Python and txtAI in Seconds! Easy Tutorial
By Python Tutorials for Digital Humanities
Streamlit and txtai: Building an Abstractive Summarization App in Python
By AI Anytime
7 Libraries to Supercharge Your NLP Journey: Embed Model & Execute Semantic Search Like a Pro
By Kamalraj M M
💫 Consulting Support
Need help with your txtai or other NLP projects? NeuML is here to help. Reach out to discuss how we can provide advisory support and/or development assistance.
🔎 Where to find NeuML
In addition to this newsletter, NeuML can be found in the following places: