freemt committed
Commit 5590f17
Parent(s): 14e63b2
Update docs
- data/resurrection-en.txt +0 -0
- data/resurrection-zh.txt +0 -0
- docs/build/doctrees/environment.pickle +0 -0
- docs/build/doctrees/examples.doctree +0 -0
- docs/build/doctrees/userguide-zh.doctree +0 -0
- docs/build/doctrees/userguide.doctree +0 -0
- docs/build/html/_sources/examples.rst.txt +1 -1
- docs/build/html/_sources/userguide-zh.rst.txt +1 -0
- docs/build/html/_sources/userguide.rst.txt +2 -0
- docs/build/html/examples.html +1 -1
- docs/build/html/searchindex.js +1 -1
- docs/build/html/userguide-zh.html +1 -0
- docs/build/html/userguide.html +1 -0
- docs/source/examples.rst +1 -1
- docs/source/userguide.rst +1 -1
- gradio_queue.db +0 -0
- img/plt.png +0 -0
- radiobee/__main__.py +1 -0
- radiobee/gradiobee.py +31 -1
data/resurrection-en.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
data/resurrection-zh.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
docs/build/doctrees/environment.pickle
CHANGED
Binary files a/docs/build/doctrees/environment.pickle and b/docs/build/doctrees/environment.pickle differ
|
|
docs/build/doctrees/examples.doctree
CHANGED
Binary files a/docs/build/doctrees/examples.doctree and b/docs/build/doctrees/examples.doctree differ
|
|
docs/build/doctrees/userguide-zh.doctree
CHANGED
Binary files a/docs/build/doctrees/userguide-zh.doctree and b/docs/build/doctrees/userguide-zh.doctree differ
|
|
docs/build/doctrees/userguide.doctree
CHANGED
Binary files a/docs/build/doctrees/userguide.doctree and b/docs/build/doctrees/userguide.doctree differ
|
|
docs/build/html/_sources/examples.rst.txt
CHANGED
@@ -5,6 +5,6 @@ Examples
|
|
5 |
|
6 |
Installation/Usage:
|
7 |
*******************
|
8 |
-
As the package has not been published on PyPi yet, it CANNOT be
|
9 |
|
10 |
For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
|
|
|
5 |
|
6 |
Installation/Usage:
|
7 |
*******************
|
8 |
+
As the package has not been published on PyPi yet, it CANNOT be installed using pip.
|
9 |
|
10 |
For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
|
docs/build/html/_sources/userguide-zh.rst.txt
CHANGED
@@ -5,6 +5,7 @@
|
|
5 |
|
6 |
- ``radiobee`` 目前仅支持中英、英中对齐。
|
7 |
- ``radiobee`` 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 ``docx``, ``pdf``, ``srt``, ``html`` 等格式。
|
|
|
8 |
- 第二次上载文件前请点击"Clear"。
|
9 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: 一般无需理会这些参数。
|
10 |
- ``esp`` 和 ``min_samples`` 的建议值 -- ``esp`` (最小 ``epsilon``): 8-12, ``min_samples``: 4-8.
|
|
|
5 |
|
6 |
- ``radiobee`` 目前仅支持中英、英中对齐。
|
7 |
- ``radiobee`` 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 ``docx``, ``pdf``, ``srt``, ``html`` 等格式。
|
8 |
+
- ``file 2`` 为空白时,``radiobee`` 则会视 ``file 1`` 为中英文混合文本及试着分离中英文,然后进行对齐。
|
9 |
- 第二次上载文件前请点击"Clear"。
|
10 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: 一般无需理会这些参数。
|
11 |
- ``esp`` 和 ``min_samples`` 的建议值 -- ``esp`` (最小 ``epsilon``): 8-12, ``min_samples``: 4-8.
|
docs/build/html/_sources/userguide.rst.txt
CHANGED
@@ -4,6 +4,8 @@ How to use
|
|
4 |
- ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
|
5 |
|
6 |
- Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
|
|
|
|
|
7 |
- Click "Clear" first for subsequent submits when uploading files.
|
8 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
|
9 |
- Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
|
|
|
4 |
- ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
|
5 |
|
6 |
- Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
|
7 |
+
- If ``file 2`` is left blank, ``radiobee`` will treat ``file 1`` as mixed English-Chinese text and attempt to separate English and Chinese texts before proceeding to align them.
|
8 |
+
|
9 |
- Click "Clear" first for subsequent submits when uploading files.
|
10 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
|
11 |
- Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
|
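Note on ``esp`` and ``min_samples``: the guide glosses ``esp`` as a minimum epsilon, which suggests density-based clustering parameters of the DBSCAN kind. The diff itself does not say which algorithm is used, so the following is only a minimal sketch under that assumption. ``core_points`` is a hypothetical helper, not part of ``radiobee``; it shows what the two numbers control: a point counts as "dense" when at least ``min_samples`` points (itself included) lie within distance ``eps`` of it.

```python
from math import dist


def core_points(points, eps, min_samples):
    """Return points with at least `min_samples` neighbours (self included) within `eps`."""
    return [
        p
        for p in points
        if sum(dist(p, q) <= eps for q in points) >= min_samples
    ]


# Ten points along a diagonal (a well-aligned run) plus one stray point.
points = [(i, i) for i in range(10)] + [(0, 40)]

# With a generous radius the diagonal is dense; the stray point is not.
dense = core_points(points, eps=10, min_samples=6)

# With a small radius no point has enough neighbours.
sparse = core_points(points, eps=2, min_samples=6)
```

Under this reading, a larger radius or a smaller ``min_samples`` admits more candidate pairs, and the opposite settings admit fewer.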
docs/build/html/examples.html
CHANGED
@@ -78,7 +78,7 @@
|
|
78 |
<p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> has in-built examples. Just click one of the rows in the <code class="docutils literal notranslate"><span class="pre">Examples</span></code> table and click <code class="docutils literal notranslate"><span class="pre">Submit</span></code> to testrun.</p>
|
79 |
<section id="installation-usage">
|
80 |
<h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
|
81 |
-
<p>As the package has not been published on PyPi yet, it CANNOT be
|
82 |
<p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
|
83 |
</section>
|
84 |
</section>
|
|
|
78 |
<p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> has in-built examples. Just click one of the rows in the <code class="docutils literal notranslate"><span class="pre">Examples</span></code> table and click <code class="docutils literal notranslate"><span class="pre">Submit</span></code> to testrun.</p>
|
79 |
<section id="installation-usage">
|
80 |
<h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
|
81 |
+
<p>As the package has not been published on PyPi yet, it CANNOT be installed using pip.</p>
|
82 |
<p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
|
83 |
</section>
|
84 |
</section>
|
docs/build/html/searchindex.js
CHANGED
@@ -1 +1 @@
|
|
1 |
-
Search.setIndex({docnames:["examples","index","intro","modules","radiobee","userguide","userguide-zh"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst","userguide.rst","userguide-zh.rst"],objects:{},objnames:{},objtypes:{},terms:{"12":[5,6],"3":2,"316287378":[5,6],"4":[5,6],"8":[5,6],"\u4e00\u822c\u65e0\u9700\u7406\u4f1a\u8fd9\u4e9b\u53c2\u6570":6,"\u4e86\u89e3\u8fd9\u4e9b\u5bf9\u9f50\u5de5\u5177":6,"\u4ee5\u540e\u53ef\u80fd\u4f1a\u652f\u6301":6,"\u4f18\u8d28\u5bf9":6,"\u4f7f\u7528\u8bf4\u660e":1,"\u53e6\u4e00\u65b9\u9762":6,"\u53ef\u4ee5\u4ee5\u540e\u4f1a\u652f\u6301":[],"\u53ef\u4ee5\u53f3\u51fb\u62f7\u51fa\u56fe\u7684\u94fe\u63a5\u7528\u6d4f\u89c8\u5668\u72ec\u7acb\u8bbf\u95ee\u62f7\u51fa\u6765\u7684\u94fe\u63a5\u6216\u53f3\u51fb\u5b58\u76d8\u518d\u7528\u770b\u56fe\u7a0b\u5e8f\u6253\u5f00\u5b58\u76d8\u7684\u56fe\u6587\u4ef6":6,"\u548c":6,"\u5acc\u56fe\u592a\u5c0f\u7684\u8bdd":6,"\u5b58\u4e0b\u6709\u5173\u53c2\u6570\u67e5\u770b\u6216\u901a\u77e5\u5f00\u53d1\u8005":6,"\u662f":6,"\u6700\u5c0f":6,"\u7684\u5b6a\u751f\u5144\u5f1f":6,"\u7684\u5efa\u8bae\u503c":6,"\u76ee\u524d\u4ec5\u652f\u6301\u4e2d\u82f1":6,"\u76ee\u524d\u4ec5\u652f\u6301\u7eaf\u6587\u672c\u6587\u4ef6\u4e0a\u8f7d":6,"\u7b2c\u4e8c\u6b21\u4e0a\u8f7d\u6587\u4ef6\u524d\u8bf7\u70b9\u51fb":6,"\u7b49":6,"\u7b49\u683c\u5f0f":6,"\u82f1\u4e2d\u5bf9\u9f50":6,"\u8bbe\u5927\u4e9b\u5219\u4f1a\u5f97\u5230\u5c11\u4e00\u4e9b\u5bf9\u9f50\u5bf9\u56e0\u4e3a\u53ef\u80fd\u9519\u5931\u4e86\u4e00\u4e9b":6,"\u8bbe\u5927\u4e9b\u5219\u53ef\u80fd\u4f1a\u9519\u5931\u4e00\u4e9b":[],"\u8bbe\u5927\u4e9b\u6216":6,"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a":[],"\u8bbe\u5c0f\u4e9b\u5
3ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a\u6709\u66f4\u591a":6,"\u8bbe\u5c0f\u4e9b\u6216":6,"\u8bef\u62a5\u5bf9":6,"\u8bf7\u52a0\u5165qq\u7fa4":6,"\u8fd0\u884c\u51fa\u9519\u65f6\u53ef\u4ee5\u70b9\u51fb":6,"\u9519\u8bef\u5224\u65ad\u4e3a\u5bf9\u9f50\u7684\u5bf9":6,"\u9519\u8bef\u5bf9":[],"do":5,"new":5,As:0,For:0,If:[2,5],On:5,The:2,To:5,about:5,ad:2,address:5,aim:2,align:[0,2,5,6],align_s:[1,3],align_text:[1,3],also:5,although:2,amend_avec:[1,3],an:2,app:[1,3],applic:2,ar:[2,5],been:[0,2],better:5,browser:5,built:0,bumblebe:[5,6],can:5,candid:5,cannot:0,cat:2,clear:[5,6],click:[0,5],cmat2tset:[1,3],co:0,contact:2,content:3,copi:5,csv:[5,6],current:2,de:2,develop:[2,5],dl_type:[5,6],docterm_scor:[1,3],docx:[5,6],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,epsilon:[5,6],esp:[5,6],etc:[2,5],exampl:[1,2],fals:5,file2text:[1,3],file:5,files2df:[1,3],find:2,first:5,flag:[5,6],format:5,full:2,further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],go:5,good:5,gradio:2,group:5,ha:[0,2],hand:5,have:5,help:2,here:2,how:1,html:[5,6],http:0,huggingfac:0,identifi:5,idf_typ:[5,6],imag:5,implement:2,index:1,inform:5,insert_spac:[1,3],instal:1,interfac:2,interpolate_pset:[1,3],introduct:1,ja:2,join:5,just:0,know:5,languag:2,larger:5,later:5,learn:2,limit:1,lists2cmat:[1,3],loadtext:[1,3],look:5,machin:2,mai:5,md:[5,6],mdx_e2c:[1,3],method:0,mikee:0,min_sampl:[5,6],minimum:5,miss:5,modul:[1,3],more:5,motiv:1,need:5,norm:[5,6],normal:5,now:0,one:0,onli:2,onlin:0,open:5,other:5,output:5,packag:[0,1,3],page:1,pair:[2,5],paragraph:2,particular:2,pdf:[5,6],permit:2,pip:0,pleas:5,plot_cmat:[1,3],plot_df:[1,3],posit:5,power:2,process_upload:[1,3],properli:2,provid:2,publish:0,pure:5,pypi:0,python:2,qq:5,radiobe:[0,2,5,6],result:5,right:5,row:0,ru:2,save:5,search:1,seg_text:[1,3],select:5,sentenc:2,should:5,shuffle_s:[1,3],sibl:5,smaller:5,smatrix:[1,3],someth:5,space:0,sr
t:[5,6],submit:[0,5],submodul:[1,3],subsequ:5,suggest:[0,5],support:[2,5],tab:5,tabl:0,tend:5,term:2,testrun:0,text:[2,5],tf_type:[5,6],time:2,tmx:2,touch:5,translat:2,trim_df:[1,3],two:2,txt:[5,6],unless:5,upload:5,us:[0,1],usag:1,valu:5,version:0,wa:2,welcom:2,what:5,when:[2,5],willing:2,wrong:5,yet:0,you:[2,5],zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s documentation!","Introduction","radiobee","radiobee package","How to use","\u4f7f\u7528\u8bf4\u660e"],titleterms:{"\u4f7f\u7528\u8bf4\u660e":6,align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,how:5,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,us:5,usag:0,welcom:1}})
|
|
|
1 |
+
Search.setIndex({docnames:["examples","index","intro","modules","radiobee","userguide","userguide-zh"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst","userguide.rst","userguide-zh.rst"],objects:{},objnames:{},objtypes:{},terms:{"1":[5,6],"12":[5,6],"2":[5,6],"3":2,"316287378":[5,6],"4":[5,6],"8":[5,6],"\u4e00\u822c\u65e0\u9700\u7406\u4f1a\u8fd9\u4e9b\u53c2\u6570":6,"\u4e3a\u4e2d\u82f1\u6587\u6df7\u5408\u6587\u672c\u53ca\u8bd5\u7740\u5206\u79bb\u4e2d\u82f1\u6587":6,"\u4e3a\u7a7a\u767d\u65f6":6,"\u4e86\u89e3\u8fd9\u4e9b\u5bf9\u9f50\u5de5\u5177":6,"\u4ee5\u540e\u53ef\u80fd\u4f1a\u652f\u6301":6,"\u4f18\u8d28\u5bf9":6,"\u4f7f\u7528\u8bf4\u660e":1,"\u5219\u4f1a\u89c6":6,"\u53e6\u4e00\u65b9\u9762":6,"\u53ef\u4ee5\u4ee5\u540e\u4f1a\u652f\u6301":[],"\u53ef\u4ee5\u53f3\u51fb\u62f7\u51fa\u56fe\u7684\u94fe\u63a5\u7528\u6d4f\u89c8\u5668\u72ec\u7acb\u8bbf\u95ee\u62f7\u51fa\u6765\u7684\u94fe\u63a5\u6216\u53f3\u51fb\u5b58\u76d8\u518d\u7528\u770b\u56fe\u7a0b\u5e8f\u6253\u5f00\u5b58\u76d8\u7684\u56fe\u6587\u4ef6":6,"\u548c":6,"\u5acc\u56fe\u592a\u5c0f\u7684\u8bdd":6,"\u5b58\u4e0b\u6709\u5173\u53c2\u6570\u67e5\u770b\u6216\u901a\u77e5\u5f00\u53d1\u8005":6,"\u662f":6,"\u6700\u5c0f":6,"\u7136\u540e\u8fdb\u884c\u5bf9\u9f50":6,"\u7684\u5b6a\u751f\u5144\u5f1f":6,"\u7684\u5efa\u8bae\u503c":6,"\u76ee\u524d\u4ec5\u652f\u6301\u4e2d\u82f1":6,"\u76ee\u524d\u4ec5\u652f\u6301\u7eaf\u6587\u672c\u6587\u4ef6\u4e0a\u8f7d":6,"\u7b2c\u4e8c\u6b21\u4e0a\u8f7d\u6587\u4ef6\u524d\u8bf7\u70b9\u51fb":6,"\u7b49":6,"\u7b49\u683c\u5f0f":6,"\u82f1\u4e2d\u5bf9\u9f50":6,"\u8bbe\u5927\u4e9b\u5219\u4f1a\u5f97\u5230\u5c11\u4e00\u4e9b\u5bf9\u9f50\u5bf9\u56e0\u4e3a\u53ef\u80fd\u9519\u5931\u4e86\u4e00\u4e9b":6,"\u8bbe\u59
27\u4e9b\u5219\u53ef\u80fd\u4f1a\u9519\u5931\u4e00\u4e9b":[],"\u8bbe\u5927\u4e9b\u6216":6,"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a":[],"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a\u6709\u66f4\u591a":6,"\u8bbe\u5c0f\u4e9b\u6216":6,"\u8bef\u62a5\u5bf9":6,"\u8bf7\u52a0\u5165qq\u7fa4":6,"\u8fd0\u884c\u51fa\u9519\u65f6\u53ef\u4ee5\u70b9\u51fb":6,"\u9519\u8bef\u5224\u65ad\u4e3a\u5bf9\u9f50\u7684\u5bf9":6,"\u9519\u8bef\u5bf9":[],"do":5,"new":5,As:0,For:0,If:[2,5],On:5,The:2,To:5,about:5,ad:2,address:5,aim:2,align:[0,2,5,6],align_s:[1,3],align_text:[1,3],also:5,although:2,amend_avec:[1,3],an:2,app:[1,3],applic:2,ar:[2,5],attempt:5,been:[0,2],befor:5,better:5,blank:5,browser:5,built:0,bumblebe:[5,6],can:5,candid:5,cannot:0,cat:2,chines:5,clear:[5,6],click:[0,5],cmat2tset:[1,3],co:0,contact:2,content:3,copi:5,csv:[5,6],current:2,de:2,develop:[2,5],dl_type:[5,6],docterm_scor:[1,3],docx:[5,6],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,english:5,epsilon:[5,6],esp:[5,6],etc:[2,5],exampl:[1,2],fals:5,file2text:[1,3],file:[5,6],files2df:[1,3],find:2,first:5,flag:[5,6],format:5,full:2,further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],go:5,good:5,gradio:2,group:5,ha:[0,2],hand:5,have:5,help:2,here:2,how:1,html:[5,6],http:0,huggingfac:0,identifi:5,idf_typ:[5,6],imag:5,implement:2,index:1,inform:5,insert_spac:[1,3],instal:1,interfac:2,interpolate_pset:[1,3],introduct:1,ja:2,join:5,just:0,know:5,languag:2,larger:5,later:5,learn:2,left:5,limit:1,lists2cmat:[1,3],loadtext:[1,3],look:5,machin:2,mai:5,md:[5,6],mdx_e2c:[1,3],method:0,mikee:0,min_sampl:[5,6],minimum:5,miss:5,mix:5,modul:[1,3],more:5,motiv:1,need:5,norm:[5,6],normal:5,now:0,one:0,onli:2,onlin:0,open:5,other:5,output:5,packag:[0,1,3],page:1,pair:[2,5],paragraph:2,particular:2,pdf:[5,6],permit:2,pip:0,pleas:5,plot_cmat:[1,3],plot
_df:[1,3],posit:5,power:2,proced:5,process_upload:[1,3],properli:2,provid:2,publish:0,pure:5,pypi:0,python:2,qq:5,radiobe:[0,2,5,6],result:5,right:5,row:0,ru:2,save:5,search:1,seg_text:[1,3],select:5,sentenc:2,separ:5,should:5,shuffle_s:[1,3],sibl:5,smaller:5,smatrix:[1,3],someth:5,space:0,srt:[5,6],submit:[0,5],submodul:[1,3],subsequ:5,suggest:[0,5],support:[2,5],tab:5,tabl:0,tend:5,term:2,testrun:0,text:[2,5],tf_type:[5,6],them:5,time:2,tmx:2,touch:5,translat:2,treat:5,trim_df:[1,3],two:2,txt:[5,6],unless:5,upload:5,us:[0,1],usag:1,valu:5,version:0,wa:2,welcom:2,what:5,when:[2,5],willing:2,wrong:5,yet:0,you:[2,5],zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s documentation!","Introduction","radiobee","radiobee package","How to use","\u4f7f\u7528\u8bf4\u660e"],titleterms:{"\u4f7f\u7528\u8bf4\u660e":6,align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,how:5,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,us:5,usag:0,welcom:1}})
|
docs/build/html/userguide-zh.html
CHANGED
@@ -76,6 +76,7 @@
|
|
76 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> 是 <code class="docutils literal notranslate"><span class="pre">bumblebee</span> <span class="pre">aligner</span></code> 的孪生兄弟。请加入qq群 <code class="docutils literal notranslate"><span class="pre">316287378</span></code> 了解这些对齐工具。</p></li>
|
77 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持中英、英中对齐。</p></li>
|
78 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> 等格式。</p></li>
|
|
|
79 |
<li><p>第二次上载文件前请点击”Clear”。</p></li>
|
80 |
<li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: 一般无需理会这些参数。</p></li>
|
81 |
<li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 和 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 的建议值 – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (最小 <code class="docutils literal notranslate"><span class="pre">epsilon</span></code>): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p>
|
|
|
76 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> 是 <code class="docutils literal notranslate"><span class="pre">bumblebee</span> <span class="pre">aligner</span></code> 的孪生兄弟。请加入qq群 <code class="docutils literal notranslate"><span class="pre">316287378</span></code> 了解这些对齐工具。</p></li>
|
77 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持中英、英中对齐。</p></li>
|
78 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> 等格式。</p></li>
|
79 |
+
<li><p><code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">2</span></code> 为空白时,<code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 则会视 <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">1</span></code> 为中英文混合文本及试着分离中英文,然后进行对齐。</p></li>
|
80 |
<li><p>第二次上载文件前请点击”Clear”。</p></li>
|
81 |
<li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: 一般无需理会这些参数。</p></li>
|
82 |
<li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 和 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 的建议值 – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (最小 <code class="docutils literal notranslate"><span class="pre">epsilon</span></code>): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p>
|
docs/build/html/userguide.html
CHANGED
@@ -75,6 +75,7 @@
|
|
75 |
<ul class="simple">
|
76 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> is a sibling of <cite>bumblebee aligner</cite>. To know more about these aligners, please join qq group <cite>316287378</cite>.</p></li>
|
77 |
<li><p>Uploaded files should be in pure text format (txt, md, csv etc). <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> etc may be supported later on.</p></li>
|
|
|
78 |
<li><p>Click “Clear” first for subsequent submits when uploading files.</p></li>
|
79 |
<li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: Normally there is no need to touch these unless you know what you are doing.</p></li>
|
80 |
<li><p>Suggested <code class="docutils literal notranslate"><span class="pre">esp</span></code> and <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p></li>
|
|
|
75 |
<ul class="simple">
|
76 |
<li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> is a sibling of <cite>bumblebee aligner</cite>. To know more about these aligners, please join qq group <cite>316287378</cite>.</p></li>
|
77 |
<li><p>Uploaded files should be in pure text format (txt, md, csv etc). <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> etc may be supported later on.</p></li>
|
78 |
+
<li><p>If <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">2</span></code> is left blank, <code class="docutils literal notranslate"><span class="pre">radiobee</span></code> will treat <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">1</span></code> as mixed English-Chinese text and attempt to separate English and Chinese texts before proceeding to align them.</p></li>
|
79 |
<li><p>Click “Clear” first for subsequent submits when uploading files.</p></li>
|
80 |
<li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: Normally there is no need to touch these unless you know what you are doing.</p></li>
|
81 |
<li><p>Suggested <code class="docutils literal notranslate"><span class="pre">esp</span></code> and <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p></li>
|
docs/source/examples.rst
CHANGED
@@ -5,6 +5,6 @@ Examples
|
|
5 |
|
6 |
Installation/Usage:
|
7 |
*******************
|
8 |
-
As the package has not been published on PyPi yet, it CANNOT be
|
9 |
|
10 |
For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
|
|
|
5 |
|
6 |
Installation/Usage:
|
7 |
*******************
|
8 |
+
As the package has not been published on PyPi yet, it CANNOT be installed using pip.
|
9 |
|
10 |
For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
|
docs/source/userguide.rst
CHANGED
@@ -4,7 +4,7 @@ How to use
|
|
4 |
- ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
|
5 |
|
6 |
- Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
|
7 |
-
- If ``file 2`` is left blank, ``
|
8 |
|
9 |
- Click "Clear" first for subsequent submits when uploading files.
|
10 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
|
|
|
4 |
- ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
|
5 |
|
6 |
- Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
|
7 |
+
- If ``file 2`` is left blank, ``radiobee`` will treat ``file 1`` as mixed English-Chinese text and attempt to separate English and Chinese texts before proceeding to align them.
|
8 |
|
9 |
- Click "Clear" first for subsequent submits when uploading files.
|
10 |
- ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
|
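Note on the new ``file 2`` bullet: the user guide says a blank ``file 2`` makes ``radiobee`` treat ``file 1`` as mixed English-Chinese text and try to separate the two languages. The diff does not show how that separation is done, so the following is only a rough sketch of one possible approach. ``split_mixed`` and the ``CJK`` pattern are hypothetical names, and classifying whole lines by the presence of a CJK character is an assumption, not the project's actual logic.

```python
import re

# Assumption: a line is "Chinese" if it contains any character from the
# basic CJK Unified Ideographs block.
CJK = re.compile(r"[\u4e00-\u9fff]")


def split_mixed(text):
    """Split mixed text into (english_lines, chinese_lines), line by line."""
    en, zh = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        (zh if CJK.search(line) else en).append(line)
    return en, zh


sample = "Hello world.\n你好,世界。\n\nGood night.\n晚安。"
en, zh = split_mixed(sample)
```

A line-level split like this would misfile lines that mix both languages, which may be why the guide only promises an attempt at separation.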
gradio_queue.db
CHANGED
Binary files a/gradio_queue.db and b/gradio_queue.db differ
|
|
img/plt.png
CHANGED
radiobee/__main__.py
CHANGED
@@ -280,6 +280,7 @@ if __name__ == "__main__":
|
|
280 |
out_file_dl,
|
281 |
out_file_dl_excel,
|
282 |
out_df_aligned,
|
|
|
283 |
]
|
284 |
# outputs = ["dataframe", "plot", "plot"] # wont work
|
285 |
# outputs = ["dataframe"]
|
|
|
280 |
out_file_dl,
|
281 |
out_file_dl_excel,
|
282 |
out_df_aligned,
|
283 |
+
gr.outputs.HTML(),
|
284 |
]
|
285 |
# outputs = ["dataframe", "plot", "plot"] # wont work
|
286 |
# outputs = ["dataframe"]
|
radiobee/gradiobee.py
CHANGED
@@ -28,6 +28,7 @@ from radiobee.text2lists import text2lists
|
|
28 |
|
29 |
sns.set()
|
30 |
sns.set_style("darkgrid")
|
|
|
31 |
|
32 |
debug = False
|
33 |
debug = True
|
@@ -313,6 +314,33 @@ def gradiobee(
|
|
313 |
df_aligned = df_aligned[["text2", "text1", "likelihood"]]
|
314 |
df_aligned.columns = ["text1", "text2", "likelihood"]
|
315 |
|
|
316 |
# ===
|
317 |
if plot_dia:
|
318 |
output_plot = "img/plt.png"
|
@@ -330,6 +358,8 @@ def gradiobee(
|
|
330 |
# return df_trimmed, plt, file_dl, file_dl_xlsx, df_aligned
|
331 |
|
332 |
# output_plot: gr.outputs.Image(type="auto", label="...")
|
333 |
-
return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned
|
|
|
|
|
334 |
|
335 |
# modi outputs
|
|
|
28 |
|
29 |
sns.set()
|
30 |
sns.set_style("darkgrid")
|
31 |
+
pd.options.display.float_format = "{:,.2f}".format
|
32 |
|
33 |
debug = False
|
34 |
debug = True
|
|
|
314 |
df_aligned = df_aligned[["text2", "text1", "likelihood"]]
|
315 |
df_aligned.columns = ["text1", "text2", "likelihood"]
|
316 |
|
317 |
+
# round the last column to 2
|
318 |
+
# df_aligned.likelihood = df_aligned.likelihood.round(2)
|
319 |
+
# df_aligned = df_aligned.round({"likelihood": 2})
|
320 |
+
|
321 |
+
# df_aligned.likelihood = df_aligned.likelihood.apply(lambda x: np.nan if x in [""] else x)
|
322 |
+
|
323 |
+
# style
|
324 |
+
styled = df_aligned.style.set_properties(
|
325 |
+
**{
|
326 |
+
"font-size": "10pt",
|
327 |
+
"border-color": "black",
|
328 |
+
"border": "1px black solid !important"
|
329 |
+
}
|
330 |
+
# border-color="black",
|
331 |
+
).set_table_styles([{
|
332 |
+
"selector": "", # noqs
|
333 |
+
"props": [("border", "2px black solid !important")]}] # noqs
|
334 |
+
).format(
|
335 |
+
precision=2
|
336 |
+
)
|
337 |
+
# .bar(subset="likelihood", color="#5fba7d")
|
338 |
+
|
339 |
+
# .background_gradient("Greys")
|
340 |
+
|
341 |
+
# df_html = df_aligned.to_html()
|
342 |
+
df_html = styled.to_html()
|
343 |
+
|
344 |
# ===
|
345 |
if plot_dia:
|
346 |
output_plot = "img/plt.png"
|
|
|
358 |
# return df_trimmed, plt, file_dl, file_dl_xlsx, df_aligned
|
359 |
|
360 |
# output_plot: gr.outputs.Image(type="auto", label="...")
|
361 |
+
# return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned
|
362 |
+
# return df_trimmed, output_plot, file_dl, file_dl_xlsx, styled, df_html # gradio cant handle style
|
363 |
+
return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned, df_html
|
364 |
|
365 |
# modi outputs
|
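Note on ``pd.options.display.float_format = "{:,.2f}".format``: the value assigned here is an ordinary bound ``str.format`` method, that is, a callable that takes one float and returns a string. The snippet below is a small standalone sketch of what that callable produces; ``fmt`` is just an illustrative name. It does not import pandas, and it changes only how numbers are rendered, not the underlying values.

```python
# The same callable the commit assigns to the pandas display option.
fmt = "{:,.2f}".format

# Two decimal places, with a comma as the thousands separator.
likelihood_text = fmt(0.8349)
large_text = fmt(1234.5)

print(likelihood_text)
print(large_text)
```

Because this is a display setting, the ``likelihood`` column keeps its full precision for any later computation; the commented-out ``round`` calls in the diff would instead have changed the stored values.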