freemt committed on
Commit
5590f17
1 Parent(s): 14e63b2

Update docs

data/resurrection-en.txt ADDED
The diff for this file is too large to render. See raw diff
 
data/resurrection-zh.txt ADDED
The diff for this file is too large to render. See raw diff
 
docs/build/doctrees/environment.pickle CHANGED
Binary files a/docs/build/doctrees/environment.pickle and b/docs/build/doctrees/environment.pickle differ
 
docs/build/doctrees/examples.doctree CHANGED
Binary files a/docs/build/doctrees/examples.doctree and b/docs/build/doctrees/examples.doctree differ
 
docs/build/doctrees/userguide-zh.doctree CHANGED
Binary files a/docs/build/doctrees/userguide-zh.doctree and b/docs/build/doctrees/userguide-zh.doctree differ
 
docs/build/doctrees/userguide.doctree CHANGED
Binary files a/docs/build/doctrees/userguide.doctree and b/docs/build/doctrees/userguide.doctree differ
 
docs/build/html/_sources/examples.rst.txt CHANGED
@@ -5,6 +5,6 @@ Examples
5
 
6
  Installation/Usage:
7
  *******************
8
- As the package has not been published on PyPi yet, it CANNOT be install using pip.
9
 
10
  For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
 
5
 
6
  Installation/Usage:
7
  *******************
8
+ As the package has not been published on PyPi yet, it CANNOT be installed using pip.
9
 
10
  For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
docs/build/html/_sources/userguide-zh.rst.txt CHANGED
@@ -5,6 +5,7 @@
5
 
6
  - ``radiobee`` 目前仅支持中英、英中对齐。
7
  - ``radiobee`` 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 ``docx``, ``pdf``, ``srt``, ``html`` 等格式。
 
8
  - 第二次上载文件前请点击"Clear"。
9
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: 一般无需理会这些参数。
10
  - ``esp`` 和 ``min_samples`` 的建议值 -- ``esp`` (最小 ``epsilon``): 8-12, ``min_samples``: 4-8.
 
5
 
6
  - ``radiobee`` 目前仅支持中英、英中对齐。
7
  - ``radiobee`` 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 ``docx``, ``pdf``, ``srt``, ``html`` 等格式。
8
+ - ``file 2`` 为空白时,``radiobee`` 则会视 ``file 1`` 为中英文混合文本及试着分离中英文,然后进行对齐。
9
  - 第二次上载文件前请点击"Clear"。
10
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: 一般无需理会这些参数。
11
  - ``esp`` 和 ``min_samples`` 的建议值 -- ``esp`` (最小 ``epsilon``): 8-12, ``min_samples``: 4-8.
docs/build/html/_sources/userguide.rst.txt CHANGED
@@ -4,6 +4,8 @@ How to use
4
  - ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
5
 
6
  - Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
 
 
7
  - Click "Clear" first for subsequent submits when uploading files.
8
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
9
  - Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
 
4
  - ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
5
 
6
  - Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
7
+ - If ``file 2`` is left blank, ``radiobee`` will treat ``file 1`` as mixed English-Chinese text and attempt to separate English and Chinese texts before procedding to align them.
8
+
9
  - Click "Clear" first for subsequent submits when uploading files.
10
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
11
  - Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
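The ``file 2`` bullet added in this hunk says that a single mixed English-Chinese file is separated into two texts before alignment. The sketch below shows one simple way such a split could work. It is not radiobee's actual implementation, which this diff does not show: the function names ``cjk_ratio`` and ``split_mixed_text``, the line-by-line heuristic, and the 0.3 threshold are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Basic CJK Unified Ideographs block only; an assumption, wider ranges exist.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def cjk_ratio(line: str) -> float:
    """Return the share of non-space characters that are CJK ideographs."""
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if _CJK.match(c)) / len(chars)


def split_mixed_text(text: str, threshold: float = 0.3) -> Tuple[List[str], List[str]]:
    """Split mixed text into (english_lines, chinese_lines), line by line."""
    english: List[str] = []
    chinese: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Hypothetical rule: a line counts as Chinese once enough of it is CJK.
        (chinese if cjk_ratio(line) >= threshold else english).append(line)
    return english, chinese


mixed = "Hello world.\n你好,世界。\n\nThis is a test.\n这是一个测试。"
en_lines, zh_lines = split_mixed_text(mixed)
```

The two resulting lists play the roles of ``file 1`` and ``file 2``. A per-line ratio is crude: a line that mixes both languages lands wholly on one side, so a real splitter would likely need something finer grained.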
docs/build/html/examples.html CHANGED
@@ -78,7 +78,7 @@
78
  <p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> has in-built examples. Just click one of the rows in the <code class="docutils literal notranslate"><span class="pre">Examples</span></code> table and click <code class="docutils literal notranslate"><span class="pre">Submit</span></code> to testrun.</p>
79
  <section id="installation-usage">
80
  <h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
81
- <p>As the package has not been published on PyPi yet, it CANNOT be install using pip.</p>
82
  <p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
83
  </section>
84
  </section>
 
78
  <p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> has in-built examples. Just click one of the rows in the <code class="docutils literal notranslate"><span class="pre">Examples</span></code> table and click <code class="docutils literal notranslate"><span class="pre">Submit</span></code> to testrun.</p>
79
  <section id="installation-usage">
80
  <h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
81
+ <p>As the package has not been published on PyPi yet, it CANNOT be installed using pip.</p>
82
  <p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
83
  </section>
84
  </section>
docs/build/html/searchindex.js CHANGED
@@ -1 +1 @@
1
- Search.setIndex({docnames:["examples","index","intro","modules","radiobee","userguide","userguide-zh"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst","userguide.rst","userguide-zh.rst"],objects:{},objnames:{},objtypes:{},terms:{"12":[5,6],"3":2,"316287378":[5,6],"4":[5,6],"8":[5,6],"\u4e00\u822c\u65e0\u9700\u7406\u4f1a\u8fd9\u4e9b\u53c2\u6570":6,"\u4e86\u89e3\u8fd9\u4e9b\u5bf9\u9f50\u5de5\u5177":6,"\u4ee5\u540e\u53ef\u80fd\u4f1a\u652f\u6301":6,"\u4f18\u8d28\u5bf9":6,"\u4f7f\u7528\u8bf4\u660e":1,"\u53e6\u4e00\u65b9\u9762":6,"\u53ef\u4ee5\u4ee5\u540e\u4f1a\u652f\u6301":[],"\u53ef\u4ee5\u53f3\u51fb\u62f7\u51fa\u56fe\u7684\u94fe\u63a5\u7528\u6d4f\u89c8\u5668\u72ec\u7acb\u8bbf\u95ee\u62f7\u51fa\u6765\u7684\u94fe\u63a5\u6216\u53f3\u51fb\u5b58\u76d8\u518d\u7528\u770b\u56fe\u7a0b\u5e8f\u6253\u5f00\u5b58\u76d8\u7684\u56fe\u6587\u4ef6":6,"\u548c":6,"\u5acc\u56fe\u592a\u5c0f\u7684\u8bdd":6,"\u5b58\u4e0b\u6709\u5173\u53c2\u6570\u67e5\u770b\u6216\u901a\u77e5\u5f00\u53d1\u8005":6,"\u662f":6,"\u6700\u5c0f":6,"\u7684\u5b6a\u751f\u5144\u5f1f":6,"\u7684\u5efa\u8bae\u503c":6,"\u76ee\u524d\u4ec5\u652f\u6301\u4e2d\u82f1":6,"\u76ee\u524d\u4ec5\u652f\u6301\u7eaf\u6587\u672c\u6587\u4ef6\u4e0a\u8f7d":6,"\u7b2c\u4e8c\u6b21\u4e0a\u8f7d\u6587\u4ef6\u524d\u8bf7\u70b9\u51fb":6,"\u7b49":6,"\u7b49\u683c\u5f0f":6,"\u82f1\u4e2d\u5bf9\u9f50":6,"\u8bbe\u5927\u4e9b\u5219\u4f1a\u5f97\u5230\u5c11\u4e00\u4e9b\u5bf9\u9f50\u5bf9\u56e0\u4e3a\u53ef\u80fd\u9519\u5931\u4e86\u4e00\u4e9b":6,"\u8bbe\u5927\u4e9b\u5219\u53ef\u80fd\u4f1a\u9519\u5931\u4e00\u4e9b":[],"\u8bbe\u5927\u4e9b\u6216":6,"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a":[],"\u8bbe\u5c0f\u4e9b\
u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a\u6709\u66f4\u591a":6,"\u8bbe\u5c0f\u4e9b\u6216":6,"\u8bef\u62a5\u5bf9":6,"\u8bf7\u52a0\u5165qq\u7fa4":6,"\u8fd0\u884c\u51fa\u9519\u65f6\u53ef\u4ee5\u70b9\u51fb":6,"\u9519\u8bef\u5224\u65ad\u4e3a\u5bf9\u9f50\u7684\u5bf9":6,"\u9519\u8bef\u5bf9":[],"do":5,"new":5,As:0,For:0,If:[2,5],On:5,The:2,To:5,about:5,ad:2,address:5,aim:2,align:[0,2,5,6],align_s:[1,3],align_text:[1,3],also:5,although:2,amend_avec:[1,3],an:2,app:[1,3],applic:2,ar:[2,5],been:[0,2],better:5,browser:5,built:0,bumblebe:[5,6],can:5,candid:5,cannot:0,cat:2,clear:[5,6],click:[0,5],cmat2tset:[1,3],co:0,contact:2,content:3,copi:5,csv:[5,6],current:2,de:2,develop:[2,5],dl_type:[5,6],docterm_scor:[1,3],docx:[5,6],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,epsilon:[5,6],esp:[5,6],etc:[2,5],exampl:[1,2],fals:5,file2text:[1,3],file:5,files2df:[1,3],find:2,first:5,flag:[5,6],format:5,full:2,further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],go:5,good:5,gradio:2,group:5,ha:[0,2],hand:5,have:5,help:2,here:2,how:1,html:[5,6],http:0,huggingfac:0,identifi:5,idf_typ:[5,6],imag:5,implement:2,index:1,inform:5,insert_spac:[1,3],instal:1,interfac:2,interpolate_pset:[1,3],introduct:1,ja:2,join:5,just:0,know:5,languag:2,larger:5,later:5,learn:2,limit:1,lists2cmat:[1,3],loadtext:[1,3],look:5,machin:2,mai:5,md:[5,6],mdx_e2c:[1,3],method:0,mikee:0,min_sampl:[5,6],minimum:5,miss:5,modul:[1,3],more:5,motiv:1,need:5,norm:[5,6],normal:5,now:0,one:0,onli:2,onlin:0,open:5,other:5,output:5,packag:[0,1,3],page:1,pair:[2,5],paragraph:2,particular:2,pdf:[5,6],permit:2,pip:0,pleas:5,plot_cmat:[1,3],plot_df:[1,3],posit:5,power:2,process_upload:[1,3],properli:2,provid:2,publish:0,pure:5,pypi:0,python:2,qq:5,radiobe:[0,2,5,6],result:5,right:5,row:0,ru:2,save:5,search:1,seg_text:[1,3],select:5,sentenc:2,should:5,shuffle_s:[1,3],sibl:5,smaller:5,smatrix:[1,3],someth:5,space:0,
srt:[5,6],submit:[0,5],submodul:[1,3],subsequ:5,suggest:[0,5],support:[2,5],tab:5,tabl:0,tend:5,term:2,testrun:0,text:[2,5],tf_type:[5,6],time:2,tmx:2,touch:5,translat:2,trim_df:[1,3],two:2,txt:[5,6],unless:5,upload:5,us:[0,1],usag:1,valu:5,version:0,wa:2,welcom:2,what:5,when:[2,5],willing:2,wrong:5,yet:0,you:[2,5],zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s documentation!","Introduction","radiobee","radiobee package","How to use","\u4f7f\u7528\u8bf4\u660e"],titleterms:{"\u4f7f\u7528\u8bf4\u660e":6,align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,how:5,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,us:5,usag:0,welcom:1}})
 
1
+ Search.setIndex({docnames:["examples","index","intro","modules","radiobee","userguide","userguide-zh"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst","userguide.rst","userguide-zh.rst"],objects:{},objnames:{},objtypes:{},terms:{"1":[5,6],"12":[5,6],"2":[5,6],"3":2,"316287378":[5,6],"4":[5,6],"8":[5,6],"\u4e00\u822c\u65e0\u9700\u7406\u4f1a\u8fd9\u4e9b\u53c2\u6570":6,"\u4e3a\u4e2d\u82f1\u6587\u6df7\u5408\u6587\u672c\u53ca\u8bd5\u7740\u5206\u79bb\u4e2d\u82f1\u6587":6,"\u4e3a\u7a7a\u767d\u65f6":6,"\u4e86\u89e3\u8fd9\u4e9b\u5bf9\u9f50\u5de5\u5177":6,"\u4ee5\u540e\u53ef\u80fd\u4f1a\u652f\u6301":6,"\u4f18\u8d28\u5bf9":6,"\u4f7f\u7528\u8bf4\u660e":1,"\u5219\u4f1a\u89c6":6,"\u53e6\u4e00\u65b9\u9762":6,"\u53ef\u4ee5\u4ee5\u540e\u4f1a\u652f\u6301":[],"\u53ef\u4ee5\u53f3\u51fb\u62f7\u51fa\u56fe\u7684\u94fe\u63a5\u7528\u6d4f\u89c8\u5668\u72ec\u7acb\u8bbf\u95ee\u62f7\u51fa\u6765\u7684\u94fe\u63a5\u6216\u53f3\u51fb\u5b58\u76d8\u518d\u7528\u770b\u56fe\u7a0b\u5e8f\u6253\u5f00\u5b58\u76d8\u7684\u56fe\u6587\u4ef6":6,"\u548c":6,"\u5acc\u56fe\u592a\u5c0f\u7684\u8bdd":6,"\u5b58\u4e0b\u6709\u5173\u53c2\u6570\u67e5\u770b\u6216\u901a\u77e5\u5f00\u53d1\u8005":6,"\u662f":6,"\u6700\u5c0f":6,"\u7136\u540e\u8fdb\u884c\u5bf9\u9f50":6,"\u7684\u5b6a\u751f\u5144\u5f1f":6,"\u7684\u5efa\u8bae\u503c":6,"\u76ee\u524d\u4ec5\u652f\u6301\u4e2d\u82f1":6,"\u76ee\u524d\u4ec5\u652f\u6301\u7eaf\u6587\u672c\u6587\u4ef6\u4e0a\u8f7d":6,"\u7b2c\u4e8c\u6b21\u4e0a\u8f7d\u6587\u4ef6\u524d\u8bf7\u70b9\u51fb":6,"\u7b49":6,"\u7b49\u683c\u5f0f":6,"\u82f1\u4e2d\u5bf9\u9f50":6,"\u8bbe\u5927\u4e9b\u5219\u4f1a\u5f97\u5230\u5c11\u4e00\u4e9b\u5bf9\u9f50\u5bf9\u56e0\u4e3a\u53ef\u80fd\u9519\u5931\u4e86\u4e00\u4e9b":6,"\u8bbe\u
5927\u4e9b\u5219\u53ef\u80fd\u4f1a\u9519\u5931\u4e00\u4e9b":[],"\u8bbe\u5927\u4e9b\u6216":6,"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a":[],"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a\u6709\u66f4\u591a":6,"\u8bbe\u5c0f\u4e9b\u6216":6,"\u8bef\u62a5\u5bf9":6,"\u8bf7\u52a0\u5165qq\u7fa4":6,"\u8fd0\u884c\u51fa\u9519\u65f6\u53ef\u4ee5\u70b9\u51fb":6,"\u9519\u8bef\u5224\u65ad\u4e3a\u5bf9\u9f50\u7684\u5bf9":6,"\u9519\u8bef\u5bf9":[],"do":5,"new":5,As:0,For:0,If:[2,5],On:5,The:2,To:5,about:5,ad:2,address:5,aim:2,align:[0,2,5,6],align_s:[1,3],align_text:[1,3],also:5,although:2,amend_avec:[1,3],an:2,app:[1,3],applic:2,ar:[2,5],attempt:5,been:[0,2],befor:5,better:5,blank:5,browser:5,built:0,bumblebe:[5,6],can:5,candid:5,cannot:0,cat:2,chines:5,clear:[5,6],click:[0,5],cmat2tset:[1,3],co:0,contact:2,content:3,copi:5,csv:[5,6],current:2,de:2,develop:[2,5],dl_type:[5,6],docterm_scor:[1,3],docx:[5,6],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,english:5,epsilon:[5,6],esp:[5,6],etc:[2,5],exampl:[1,2],fals:5,file2text:[1,3],file:[5,6],files2df:[1,3],find:2,first:5,flag:[5,6],format:5,full:2,further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],go:5,good:5,gradio:2,group:5,ha:[0,2],hand:5,have:5,help:2,here:2,how:1,html:[5,6],http:0,huggingfac:0,identifi:5,idf_typ:[5,6],imag:5,implement:2,index:1,inform:5,insert_spac:[1,3],instal:1,interfac:2,interpolate_pset:[1,3],introduct:1,ja:2,join:5,just:0,know:5,languag:2,larger:5,later:5,learn:2,left:5,limit:1,lists2cmat:[1,3],loadtext:[1,3],look:5,machin:2,mai:5,md:[5,6],mdx_e2c:[1,3],method:0,mikee:0,min_sampl:[5,6],minimum:5,miss:5,mix:5,modul:[1,3],more:5,motiv:1,need:5,norm:[5,6],normal:5,now:0,one:0,onli:2,onlin:0,open:5,other:5,output:5,packag:[0,1,3],page:1,pair:[2,5],paragraph:2,particular:2,pdf:[5,6],permit:2,pip:0,pleas:5,plot_cmat:[1,3],pl
ot_df:[1,3],posit:5,power:2,proced:5,process_upload:[1,3],properli:2,provid:2,publish:0,pure:5,pypi:0,python:2,qq:5,radiobe:[0,2,5,6],result:5,right:5,row:0,ru:2,save:5,search:1,seg_text:[1,3],select:5,sentenc:2,separ:5,should:5,shuffle_s:[1,3],sibl:5,smaller:5,smatrix:[1,3],someth:5,space:0,srt:[5,6],submit:[0,5],submodul:[1,3],subsequ:5,suggest:[0,5],support:[2,5],tab:5,tabl:0,tend:5,term:2,testrun:0,text:[2,5],tf_type:[5,6],them:5,time:2,tmx:2,touch:5,translat:2,treat:5,trim_df:[1,3],two:2,txt:[5,6],unless:5,upload:5,us:[0,1],usag:1,valu:5,version:0,wa:2,welcom:2,what:5,when:[2,5],willing:2,wrong:5,yet:0,you:[2,5],zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s documentation!","Introduction","radiobee","radiobee package","How to use","\u4f7f\u7528\u8bf4\u660e"],titleterms:{"\u4f7f\u7528\u8bf4\u660e":6,align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,how:5,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,us:5,usag:0,welcom:1}})
docs/build/html/userguide-zh.html CHANGED
@@ -76,6 +76,7 @@
76
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> 是 <code class="docutils literal notranslate"><span class="pre">bumblebee</span> <span class="pre">aligner</span></code> 的孪生兄弟。请加入qq群 <code class="docutils literal notranslate"><span class="pre">316287378</span></code> 了解这些对齐工具。</p></li>
77
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持中英、英中对齐。</p></li>
78
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> 等格式。</p></li>
 
79
  <li><p>第二次上载文件前请点击”Clear”。</p></li>
80
  <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: 一般无需理会这些参数。</p></li>
81
  <li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 和 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 的建议值 – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (最小 <code class="docutils literal notranslate"><span class="pre">epsilon</span></code>): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p>
 
76
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> 是 <code class="docutils literal notranslate"><span class="pre">bumblebee</span> <span class="pre">aligner</span></code> 的孪生兄弟。请加入qq群 <code class="docutils literal notranslate"><span class="pre">316287378</span></code> 了解这些对齐工具。</p></li>
77
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持中英、英中对齐。</p></li>
78
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 目前仅支持纯文本文件上载 (txt, md, csv 等)。 以后可能会支持 <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> 等格式。</p></li>
79
+ <li><p><code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">2</span></code> 为空白时,<code class="docutils literal notranslate"><span class="pre">radiobee</span></code> 则会视 <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">1</span></code> 为中英文混合文本及试着分离中英文,然后进行对齐。</p></li>
80
  <li><p>第二次上载文件前请点击”Clear”。</p></li>
81
  <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: 一般无需理会这些参数。</p></li>
82
  <li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 和 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 的建议值 – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (最小 <code class="docutils literal notranslate"><span class="pre">epsilon</span></code>): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p>
docs/build/html/userguide.html CHANGED
@@ -75,6 +75,7 @@
75
  <ul class="simple">
76
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> is a sibling of <cite>bumblebee aligner</cite>. To know more about these aligners, please join qq group <cite>316287378</cite>.</p></li>
77
  <li><p>Uploaded files should be in pure text format (txt, md, csv etc). <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> etc may be supported later on.</p></li>
 
78
  <li><p>Click “Clear” first for subsequent submits when uploading files.</p></li>
79
  <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: Normally there is no need to touch these unless you know what you are doing.</p></li>
80
  <li><p>Suggested <code class="docutils literal notranslate"><span class="pre">esp</span></code> and <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p></li>
 
75
  <ul class="simple">
76
  <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> is a sibling of <cite>bumblebee aligner</cite>. To know more about these aligners, please join qq group <cite>316287378</cite>.</p></li>
77
  <li><p>Uploaded files should be in pure text format (txt, md, csv etc). <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code> etc may be supported later on.</p></li>
78
+ <li><p>If <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">2</span></code> is left blank, <code class="docutils literal notranslate"><span class="pre">radiobee</span></code> will treat <code class="docutils literal notranslate"><span class="pre">file</span> <span class="pre">1</span></code> as mixed English-Chinese text and attempt to separate English and Chinese texts before procedding to align them.</p></li>
79
  <li><p>Click “Clear” first for subsequent submits when uploading files.</p></li>
80
  <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: Normally there is no need to touch these unless you know what you are doing.</p></li>
81
  <li><p>Suggested <code class="docutils literal notranslate"><span class="pre">esp</span></code> and <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p></li>
docs/source/examples.rst CHANGED
@@ -5,6 +5,6 @@ Examples
5
 
6
  Installation/Usage:
7
  *******************
8
- As the package has not been published on PyPi yet, it CANNOT be install using pip.
9
 
10
  For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
 
5
 
6
  Installation/Usage:
7
  *******************
8
+ As the package has not been published on PyPi yet, it CANNOT be installed using pip.
9
 
10
  For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
docs/source/userguide.rst CHANGED
@@ -4,7 +4,7 @@ How to use
4
  - ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
5
 
6
  - Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
7
- - If ``file 2`` is left blank, ``radibee`` will treat ``file 1`` as mixed English-Chinese text and attempt to separate English and Chinese texts before procedding to align them.
8
 
9
  - Click "Clear" first for subsequent submits when uploading files.
10
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
 
4
  - ``radiobee aligner`` is a sibling of `bumblebee aligner`. To know more about these aligners, please join qq group `316287378`.
5
 
6
  - Uploaded files should be in pure text format (txt, md, csv etc). ``docx``, ``pdf``, ``srt``, ``html`` etc may be supported later on.
7
+ - If ``file 2`` is left blank, ``radiobee`` will treat ``file 1`` as mixed English-Chinese text and attempt to separate English and Chinese texts before procedding to align them.
8
 
9
  - Click "Clear" first for subsequent submits when uploading files.
10
  - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
gradio_queue.db CHANGED
Binary files a/gradio_queue.db and b/gradio_queue.db differ
 
img/plt.png CHANGED
radiobee/__main__.py CHANGED
@@ -280,6 +280,7 @@ if __name__ == "__main__":
280
  out_file_dl,
281
  out_file_dl_excel,
282
  out_df_aligned,
 
283
  ]
284
  # outputs = ["dataframe", "plot", "plot"] # wont work
285
  # outputs = ["dataframe"]
 
280
  out_file_dl,
281
  out_file_dl_excel,
282
  out_df_aligned,
283
+ gr.outputs.HTML(),
284
  ]
285
  # outputs = ["dataframe", "plot", "plot"] # wont work
286
  # outputs = ["dataframe"]
radiobee/gradiobee.py CHANGED
@@ -28,6 +28,7 @@ from radiobee.text2lists import text2lists
28
 
29
  sns.set()
30
  sns.set_style("darkgrid")
 
31
 
32
  debug = False
33
  debug = True
@@ -313,6 +314,33 @@ def gradiobee(
313
  df_aligned = df_aligned[["text2", "text1", "likelihood"]]
314
  df_aligned.columns = ["text1", "text2", "likelihood"]
315
 
316
  # ===
317
  if plot_dia:
318
  output_plot = "img/plt.png"
@@ -330,6 +358,8 @@ def gradiobee(
330
  # return df_trimmed, plt, file_dl, file_dl_xlsx, df_aligned
331
 
332
  # output_plot: gr.outputs.Image(type="auto", label="...")
333
- return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned
 
 
334
 
335
  # modi outputs
 
28
 
29
  sns.set()
30
  sns.set_style("darkgrid")
31
+ pd.options.display.float_format = "{:,.2f}".format
32
 
33
  debug = False
34
  debug = True
 
314
  df_aligned = df_aligned[["text2", "text1", "likelihood"]]
315
  df_aligned.columns = ["text1", "text2", "likelihood"]
316
 
317
+ # round the last column to 2
318
+ # df_aligned.likelihood = df_aligned.likelihood.round(2)
319
+ # df_aligned = df_aligned.round({"likelihood": 2})
320
+
321
+ # df_aligned.likelihood = df_aligned.likelihood.apply(lambda x: np.nan if x in [""] else x)
322
+
323
+ # style
324
+ styled = df_aligned.style.set_properties(
325
+ **{
326
+ "font-size": "10pt",
327
+ "border-color": "black",
328
+ "border": "1px black solid !important"
329
+ }
330
+ # border-color="black",
331
+ ).set_table_styles([{
332
+ "selector": "", # noqs
333
+ "props": [("border", "2px black solid !important")]}] # noqs
334
+ ).format(
335
+ precision=2
336
+ )
337
+ # .bar(subset="likelihood", color="#5fba7d")
338
+
339
+ # .background_gradient("Greys")
340
+
341
+ # df_html = df_aligned.to_html()
342
+ df_html = styled.to_html()
343
+
344
  # ===
345
  if plot_dia:
346
  output_plot = "img/plt.png"
 
358
  # return df_trimmed, plt, file_dl, file_dl_xlsx, df_aligned
359
 
360
  # output_plot: gr.outputs.Image(type="auto", label="...")
361
+ # return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned
362
+ # return df_trimmed, output_plot, file_dl, file_dl_xlsx, styled, df_html # gradio cant handle style
363
+ return df_trimmed, output_plot, file_dl, file_dl_xlsx, df_aligned, df_html
364
 
365
  # modi outputs
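The gradiobee.py hunks above set a two-decimal float format and return an extra HTML string (``df_html``) built from the aligned table. The sketch below uses only the standard library to show what the ``"{:,.2f}".format`` callable does and how a bordered HTML table could be produced from rows. It is a stand-in for illustration, not the pandas ``Styler`` call the commit uses: ``rows_to_html`` and the sample rows are hypothetical.

```python
from html import escape

# The same callable the diff assigns to pd.options.display.float_format.
fmt = "{:,.2f}".format


def rows_to_html(rows, columns=("text1", "text2", "likelihood")):
    """Render rows as a bordered HTML table, floats shown with 2 decimals."""
    cell_style = "border: 1px solid black; font-size: 10pt"

    def cell(value):
        # Floats get two decimals; anything else (including an empty
        # likelihood) is converted to text and HTML-escaped as-is.
        text = fmt(value) if isinstance(value, float) else escape(str(value))
        return f'<td style="{cell_style}">{text}</td>'

    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(cell(v) for v in row) + "</tr>" for row in rows
    )
    return (
        '<table style="border: 2px solid black">'
        f"<thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
    )


sample = [("Hello.", "你好。", 0.87654), ("Orphan line.", "", "")]
df_html = rows_to_html(sample)
```

A string like ``df_html`` is what an HTML output component would display. In the commit, the function now returns six values, which lines up with the ``gr.outputs.HTML()`` entry appended to the outputs list in ``radiobee/__main__.py``.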