import gradio as gr
from transformers import pipeline

model_pipeline = pipeline("text2text-generation", model="tribler/dsi-search-on-toy-dataset")

def process_query(query):
    """Run the model on a query and render the generated identifier as HTML."""
    results = model_pipeline(query, max_length=60)
    result_text = results[0]['generated_text'].strip()
    if result_text.startswith("http"):
        # YouTube URL: embed a player for the video ID that follows 'watch?v='.
        youtube_id = result_text.split('watch?v=')[-1]
        iframe = f'<iframe width="560" height="315" src="https://www.youtube.com/embed/{youtube_id}" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>'
        return gr.HTML(iframe)
    elif result_text.startswith("magnet"):
        # Magnet link: render it as a clickable link.
        return gr.HTML(f'<a href="{result_text}" target="_blank">{result_text}</a>')
    else:
        # Anything else is treated as a Bitcoin wallet address, shown next to a logo.
        bitcoin_logo_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Bitcoin.svg/800px-Bitcoin.svg.png"
        # The interface output is HTML, so return gr.HTML rather than gr.Textbox.
        return gr.HTML(f'<div style="display:flex;align-items:center;"><img src="{bitcoin_logo_url}" alt="Bitcoin Logo" style="width:20px;height:20px;margin-right:5px;"><span>{result_text}</span></div>')

interface = gr.Interface(fn=process_query,
                          inputs=gr.Textbox(label="Query"),
                          outputs="html",
                          title="Search Interface",
                          submit_btn="Find",
                          description="""
                          ### Search for movie trailers, music torrents, and bitcoin wallet addresses! 
                          
                          This toy example reproduces about 500 URLs exactly after only a few hours of training on a single GPU. (View the [dataset](https://huggingface.co/tribler/dsi-search-on-toy-dataset/blob/main/dataset.csv), the [scientific article](https://arxiv.org/pdf/2404.12237.pdf) from EuroMLSys, the [model](https://huggingface.co/tribler/dsi-search-on-toy-dataset), and [all code](https://github.com/Tribler/De-DSI).)
                          """,
                          article="""
## De-DSI

De-DSI is a proof of principle for fully decentralised search engines.
We show that, in principle, it is possible to connect millions or even billions of devices to form a decentralised search engine. This represents a step towards a "[global brain](https://dl.acm.org/doi/pdf/10.1145/2160718.2160731)" for humanity.

Generative AI is increasingly influencing fields such as content discovery, relevance ranking, and financial transactions, showcasing its potential to disrupt various industries. 
These novel end-to-end generative architectures could pave the way for fully decentralized alternatives in social media, the movie industry, search engines, and the financial sector, mirroring the decentralization levels of Bitcoin and BitTorrent.
This shift could significantly empower ordinary Internet users.
Explore the scientific foundation of this transformation in our paper presented at EuroMLSys 2024. 
The paper is available [here](https://huggingface.co/papers/2404.12237).
We invite you to contribute to and engage with our community at the International Workshop on Distributed Infrastructure for Common Good (DICG).


### Demo

For this demo, we trained an end-to-end generative Transformer on a small dataset (526 records) that comprises YouTube URLs, magnet links, and Bitcoin wallet addresses.
Each identifier is annotated with a title and links to a movie trailer, CC-licensed music, or the BTC address of an independent artist.
With this, we present a proof of concept of DSI's ability to retrieve arbitrary identifiers (URLs/hashes) in response to natural-language user queries.
The model is available under a permissive license and can be accessed [here](https://huggingface.co/tribler/dsi-search-on-toy-dataset).
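
As a rough illustration of how the demo handles the model's output, the sketch below labels a generated identifier by its prefix and pulls the video ID out of a YouTube watch URL. The helper names `classify_identifier` and `youtube_video_id` and the sample strings are hypothetical placeholders, not part of the released code or dataset.

```python
from urllib.parse import urlparse, parse_qs


def classify_identifier(identifier):
    # Hypothetical helper: label a generated identifier by its prefix,
    # mirroring the three record types in the toy dataset.
    text = identifier.strip()
    if text.startswith('http'):
        return 'url'
    if text.startswith('magnet'):
        return 'magnet'
    # Assumption: any other output is treated as a Bitcoin wallet address.
    return 'bitcoin'


def youtube_video_id(url):
    # Hypothetical helper: return the value of the v= query parameter,
    # or None when the URL carries no such parameter.
    values = parse_qs(urlparse(url).query).get('v')
    return values[0] if values else None


# Placeholder inputs for illustration only.
print(classify_identifier('magnet:?xt=urn:btih:PLACEHOLDER'))
print(youtube_video_id('https://www.youtube.com/watch?v=VIDEO_ID'))
```

Parsing the query string, rather than splitting on `watch?v=`, keeps the video ID clean when the URL carries extra parameters such as a timestamp.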

### Please Note

This project represents both a groundbreaking advance and a preliminary exploration into decentralized systems. 
As a preliminary model, it showcases a toy example rather than the full capabilities such a system could eventually reach.
It serves as a proof of concept that invites further development and imagination.

### Decentralisation background 

Why is the decentralisation of AI a milestone? The Internet itself was born from a report that investigated ["is decentralized communication possible?"](https://doi.org/10.7249/RM2632). A fully decentralised form of money called Bitcoin disrupted the highly regulated financial industry. BitTorrent disrupted the monopolies around broadcasting by making it fully decentralised.

The elements that have enabled humanity to shape the world are not strength or speed, but intelligence and money.
Our Tribler lab is focused on advancing these topics and ensuring they benefit ordinary citizens.
Our [entire research portfolio](https://arxiv.org/a/pouwelse_j_1.html) is driven by idealism. We aim to remove power from companies, governments, and AI in order to shift all this power to self-sovereign citizens.
For instance, our "[unstoppable DAO](https://dl.acm.org/doi/pdf/10.1145/3565383.3566112)" technology creates a limited form of collective money with democratic control. We pioneered [decentralised trust](https://arxiv.org/pdf/2207.09950) with [deployment](https://research.tudelft.nl/files/89353583/1_s2.0_S1389128621001705_main.pdf). Our educational master's program teaches students to engineer [collective decision](https://github.com/Tribler/tribler/issues/7691) mechanisms. The [goal of the Tribler lab](https://github.com/Tribler/tribler/issues/7064) is to prototype the first global brain by 2040.
                          """,
                          examples=[["spider man"], ["oceans 13"], ["sister starlight"], ["bitcoin address of xileno"]],
                          concurrency_limit=50)

if __name__ == "__main__":
    interface.launch()