Rohil Bansal committed
Commit dd0bddd · 1 Parent(s): c02e015
Scraper improved.
- course_search/scraper/__pycache__/__init__.cpython-311.pyc +0 -0
- course_search/scraper/__pycache__/__init__.cpython-312.pyc +0 -0
- course_search/scraper/__pycache__/course_scraper.cpython-311.pyc +0 -0
- course_search/scraper/__pycache__/course_scraper.cpython-312.pyc +0 -0
- course_search/scraper/course_scraper.py +1 -1
- course_search/search_system/__pycache__/__init__.cpython-311.pyc +0 -0
- course_search/search_system/__pycache__/__init__.cpython-312.pyc +0 -0
- course_search/search_system/__pycache__/data_pipeline.cpython-311.pyc +0 -0
- course_search/search_system/__pycache__/data_pipeline.cpython-312.pyc +0 -0
- course_search/search_system/__pycache__/embeddings.cpython-311.pyc +0 -0
- course_search/search_system/__pycache__/embeddings.cpython-312.pyc +0 -0
- course_search/search_system/__pycache__/rag_system.cpython-311.pyc +0 -0
- course_search/search_system/__pycache__/vector_store.cpython-311.pyc +0 -0
- course_search/search_system/__pycache__/vector_store.cpython-312.pyc +0 -0
- data/cache/course_embeddings.npy +3 -0
- data/cache/faiss_index.bin +3 -0
- data/courses.pkl +3 -0
- data/courses_with_embeddings.pkl +2 -2
- data/embedding_cache/embeddings_cache_all-MiniLM-L6-v2.pkl +2 -2
course_search/scraper/__pycache__/__init__.cpython-311.pyc
CHANGED
Binary files a/course_search/scraper/__pycache__/__init__.cpython-311.pyc and b/course_search/scraper/__pycache__/__init__.cpython-311.pyc differ
course_search/scraper/__pycache__/__init__.cpython-312.pyc
DELETED
Binary file (161 Bytes)
course_search/scraper/__pycache__/course_scraper.cpython-311.pyc
CHANGED
Binary files a/course_search/scraper/__pycache__/course_scraper.cpython-311.pyc and b/course_search/scraper/__pycache__/course_scraper.cpython-311.pyc differ
course_search/scraper/__pycache__/course_scraper.cpython-312.pyc
DELETED
Binary file (5.03 kB)
course_search/scraper/course_scraper.py
CHANGED
@@ -62,7 +62,7 @@ class CourseScraper:
 course_info['title'] = title_elem.text.strip()

 # Extract description
-desc_elem = soup.find('div', class_='rich-
+desc_elem = soup.find('div', class_='rich-text section-height__medium')
 if desc_elem:
     course_info['description'] = desc_elem.text.strip()
 
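The selector change above relies on how Beautiful Soup matches a multi-class string passed to `class_`: the string is compared against the full value of the `class` attribute, so class order and extra classes matter. The sketch below illustrates that behaviour next to an order-insensitive CSS selector. The HTML snippets and the helper names `description_exact` and `description_css` are hypothetical and are not part of `course_scraper.py`; whether the target site ever reorders its classes is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first matches the selector string exactly,
# the second has the same classes reordered plus one extra class.
HTML_EXACT = '<div class="rich-text section-height__medium"> Intro to ML </div>'
HTML_REORDERED = '<div class="section-height__medium rich-text extra"> Intro to ML </div>'


def description_exact(html):
    """Look up the description the way the commit does (exact class string)."""
    soup = BeautifulSoup(html, "html.parser")
    elem = soup.find("div", class_="rich-text section-height__medium")
    return elem.text.strip() if elem else None


def description_css(html):
    """Alternative: a CSS selector that requires both classes in any order."""
    soup = BeautifulSoup(html, "html.parser")
    elem = soup.select_one("div.rich-text.section-height__medium")
    return elem.text.strip() if elem else None


if __name__ == "__main__":
    for name, html in (("exact", HTML_EXACT), ("reordered", HTML_REORDERED)):
        print(name, description_exact(html), description_css(html))
```

If the page markup is stable, the exact-string form in the commit is sufficient; the CSS selector is only a more tolerant option should the class list change.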
course_search/search_system/__pycache__/__init__.cpython-311.pyc
CHANGED
Binary files a/course_search/search_system/__pycache__/__init__.cpython-311.pyc and b/course_search/search_system/__pycache__/__init__.cpython-311.pyc differ
course_search/search_system/__pycache__/__init__.cpython-312.pyc
DELETED
Binary file (167 Bytes)
course_search/search_system/__pycache__/data_pipeline.cpython-311.pyc
CHANGED
Binary files a/course_search/search_system/__pycache__/data_pipeline.cpython-311.pyc and b/course_search/search_system/__pycache__/data_pipeline.cpython-311.pyc differ
course_search/search_system/__pycache__/data_pipeline.cpython-312.pyc
DELETED
Binary file (2.6 kB)
course_search/search_system/__pycache__/embeddings.cpython-311.pyc
CHANGED
Binary files a/course_search/search_system/__pycache__/embeddings.cpython-311.pyc and b/course_search/search_system/__pycache__/embeddings.cpython-311.pyc differ
course_search/search_system/__pycache__/embeddings.cpython-312.pyc
DELETED
Binary file (2.65 kB)
course_search/search_system/__pycache__/rag_system.cpython-311.pyc
CHANGED
Binary files a/course_search/search_system/__pycache__/rag_system.cpython-311.pyc and b/course_search/search_system/__pycache__/rag_system.cpython-311.pyc differ
course_search/search_system/__pycache__/vector_store.cpython-311.pyc
CHANGED
Binary files a/course_search/search_system/__pycache__/vector_store.cpython-311.pyc and b/course_search/search_system/__pycache__/vector_store.cpython-311.pyc differ
course_search/search_system/__pycache__/vector_store.cpython-312.pyc
DELETED
Binary file (4.05 kB)
data/cache/course_embeddings.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f4d91de39bffed6c0291356791966feaaf2314c2d3391975038eec443e570f8
+size 40064
data/cache/faiss_index.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:406a21e75f0af486e6bef5f8270daf482f2b7a7c1cf45d1091aabc25455a4ba7
+size 39981
data/courses.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f2945792f4e7f95855e4f48bcf3977aa52ec3156242e70efcaaf07f44864267a
+size 77762
data/courses_with_embeddings.pkl
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ee32928e02155e79620c72b032cd166fefc8de20dd648b65cea7aa3a6b2641bf
+size 118516
data/embedding_cache/embeddings_cache_all-MiniLM-L6-v2.pkl
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c05df73cb5ed569e2c252f4b6fed91aec245f8634b7023cfd134dfd58bb3024a
+size 166317
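The data files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough sketch of how such a pointer could be checked against a downloaded file, the helpers below parse a pointer and compare it with raw bytes. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is the one shown above for `data/cache/course_embeddings.npy`.

```python
import hashlib

# Pointer text copied from the diff for data/cache/course_embeddings.npy.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8f4d91de39bffed6c0291356791966feaaf2314c2d3391975038eec443e570f8
size 40064
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and hash the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algo"], info["size"])
```

In practice `data` would be the contents of the real file, for example `open("data/cache/course_embeddings.npy", "rb").read()`, once the LFS object has been fetched.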