Commit ed13a16 by neon_arch
Parent(s): e791000
updating and improving README.org
Files changed:
- README.org +2 -2
- src/engines/duckduckgo.rs +20 -4
- src/engines/searx.rs +20 -4
README.org
CHANGED
@@ -1,6 +1,6 @@
 * Websurfx

-
+A lightning fast, privacy respecting, secure [[https://en.wikipedia.org/wiki/Metasearch_engine][meta search engine]]. (pronounced as websurface or web-surface /wɛbˈsɜːrfəs/.)

 * Preview

@@ -45,7 +45,7 @@ and then open your browser of choice and visit [[http://127.0.0.1:8080]] and the

 ** Why Websurfx?

-The main goal of the project is to provide a fast, secure and privacy-focused [[https://en.wikipedia.org/wiki/Metasearch_engine][meta search engine]]. Though there are many meta search engines out there, they don't guarantee the security of their search engine, which is essential because privacy is sometimes related to security: for example, some memory vulnerabilities can leak private or sensitive information, which is never good. Being written in Rust, the project guarantees memory safety and thus eliminates such problems.
+The main goal of the project is to provide a fast, secure and privacy-focused [[https://en.wikipedia.org/wiki/Metasearch_engine][meta search engine]]. Though there are many meta search engines out there, they don't guarantee the security of their search engine, which is essential because privacy is sometimes related to security: for example, some memory vulnerabilities can leak private or sensitive information, which is never good. Being written in Rust, the project guarantees memory safety and thus eliminates such problems. Many meta search engines also lack features like advanced image search (which is required by many graphic designers, content creators, etc.) and proper NSFW blocking (many links are still visible even on strict safe search), which *websurfx* aims to provide.

 ** Why GPLv3?

src/engines/duckduckgo.rs
CHANGED
@@ -1,3 +1,7 @@
+//! The `duckduckgo` module handles the scraping of results from the duckduckgo search engine
+//! by querying the upstream duckduckgo search engine with the user provided query and with a
+//! page number if provided.
+
 use std::collections::HashMap;

 use reqwest::header::USER_AGENT;
@@ -5,10 +9,22 @@ use scraper::{Html, Selector};

 use crate::search_results_handler::aggregation_models::RawSearchResult;

-
-
-
-
+/// This function scrapes results from the upstream engine duckduckgo and puts all the scraped
+/// results like title, visiting_url (href in html), engine (the engine it was fetched from)
+/// and description in a RawSearchResult, then adds that to a HashMap whose keys are urls and
+/// whose values are RawSearchResult structs, and then returns it within a Result enum.
+///
+/// # Arguments
+///
+/// * `query` - Takes the user provided query to query the upstream search engine with.
+/// * `page` - Takes an Option<u32> as argument which can be either None or a valid page number.
+/// * `user_agent` - Takes a random user agent string as an argument.
+///
+/// # Errors
+///
+/// Returns a reqwest error if the user is not connected to the internet or if there is a failure
+/// to reach the above **upstream search engine** page, and also returns an error if the scraping
+/// selector fails to initialize.
 pub async fn results(
     query: &str,
     page: Option<u32>,
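The doc comment on `results` describes collecting scraped fields into a `RawSearchResult` and storing each one in a `HashMap` keyed by URL. A minimal sketch of that aggregation step might look like the following. The struct here is a hypothetical stand-in (the real `RawSearchResult` fields and types are not shown in this diff), `collect_results` is an invented helper name, and keeping the first entry for a repeated URL is an assumption rather than the project's confirmed behaviour.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `RawSearchResult`;
/// the real struct's fields and types may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct RawSearchResult {
    pub title: String,
    pub visiting_url: String,
    pub description: String,
    pub engine: String,
}

/// Collects already-scraped `(title, url, description)` rows into a map
/// keyed by URL. A URL that appears more than once keeps its first entry
/// (an assumption made for this sketch).
pub fn collect_results(
    rows: &[(&str, &str, &str)],
    engine: &str,
) -> HashMap<String, RawSearchResult> {
    let mut results = HashMap::new();
    for &(title, url, description) in rows {
        // `entry(..).or_insert_with(..)` only builds the value when the key is new.
        results
            .entry(url.to_string())
            .or_insert_with(|| RawSearchResult {
                title: title.to_string(),
                visiting_url: url.to_string(),
                description: description.to_string(),
                engine: engine.to_string(),
            });
    }
    results
}
```

Keying by URL gives cheap de-duplication within one engine's results; merging results across engines would need a separate policy that this sketch does not cover.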
src/engines/searx.rs
CHANGED
@@ -1,3 +1,7 @@
+//! The `searx` module handles the scraping of results from the searx search engine instance
+//! by querying the upstream searx search engine instance with the user provided query and with
+//! a page number if provided.
+
 use std::collections::HashMap;

 use reqwest::header::USER_AGENT;
@@ -5,10 +9,22 @@ use scraper::{Html, Selector};

 use crate::search_results_handler::aggregation_models::RawSearchResult;

-
-
-
-
+/// This function scrapes results from the upstream engine searx and puts all the scraped
+/// results like title, visiting_url (href in html), engine (the engine it was fetched from)
+/// and description in a RawSearchResult, then adds that to a HashMap whose keys are urls and
+/// whose values are RawSearchResult structs, and then returns it within a Result enum.
+///
+/// # Arguments
+///
+/// * `query` - Takes the user provided query to query the upstream search engine with.
+/// * `page` - Takes an Option<u32> as argument which can be either None or a valid page number.
+/// * `user_agent` - Takes a random user agent string as an argument.
+///
+/// # Errors
+///
+/// Returns a reqwest error if the user is not connected to the internet or if there is a failure
+/// to reach the above **upstream search engine** page, and also returns an error if the scraping
+/// selector fails to initialize.
 pub async fn results(
     query: &str,
     page: Option<u32>,
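The `page` argument is documented as an `Option<u32>` that is either `None` or a valid page number. One way such a value could be turned into an upstream request URL is sketched below. Everything here is illustrative: `build_search_url` and `percent_encode` are invented helpers, the instance URL is a placeholder, defaulting `None` to page 1 is an assumption, and `q`/`pageno` are used on the understanding that they are the query and page parameters of searx's search endpoint. The body of `results` is not visible in this diff, so this is not a description of what it actually does.

```rust
/// Percent-encodes everything except RFC 3986 unreserved characters.
/// A deliberately small encoder so the sketch needs no external crate.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds a search URL for a (hypothetical) searx instance.
/// `None` falls back to page 1, which is an assumption of this sketch.
pub fn build_search_url(base_url: &str, query: &str, page: Option<u32>) -> String {
    let page = page.unwrap_or(1);
    format!(
        "{}/search?q={}&pageno={}",
        base_url.trim_end_matches('/'),
        percent_encode(query),
        page
    )
}
```

Taking `Option<u32>` at the function boundary keeps the "no page given" case explicit for callers, with the default applied in exactly one place.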