COUNTRY FOCUS: GERMANY
es AI knowledge project built with NVIDIA AI
goal of Wikimedia Deutschland and DataStax is to provide this data as an open, accessible dataset of the world’s knowledge, available to the open-source AI/ML community.
One of the key technical challenges was vector embedding such a large and constantly changing dataset in a way that keeps it up to date for developers to use.
Dr Jonathan Fraine, Chief Technology Officer, Wikimedia Deutschland, said: “WMDE plans to make Wikidata’s data easily accessible for the open-source AI/ML community via an advanced vector search by expanding the functionality with fully multilingual models, such as Jina AI through DataStax’s API portal, to semantically search up to 100 of the languages represented on Wikidata.
“To vector embed a large, massively multilingual, multicultural, and dynamic dataset is a hard challenge, especially for low-resource, low-capacity open-source developers. With DataStax’s collaboration, there is a chance that the world can soon access large subsets of Wikidata’s data for their AI/ML applications through an easier-to-access method.
www.intelligentcio.com INTELLIGENTCIO EUROPE 47