---
license: mit
language:
- en
tags:
- schema
- word-embeddings
- embeddings
- unsupervised-learning
- tables
- web-table
- schema-data
---
# Pre-trained Web Table Embeddings
The models here represent schema terms and instance data terms in a shared semantic vector space. This makes them especially useful for representing schema and class information, as well as for ML tasks on tabular text data.
The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).
## Quick Start
You can install the `table_embeddings` package to encode text from tables by running the following commands:
```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```
After that, you can encode text with the following Python snippet:
```python
from table_embeddings import TableEmbeddingModel
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo150')
embedding = model.get_header_vector('headline')
```
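Once two terms are encoded, a common next step is to compare them. The following is a minimal sketch of cosine similarity between two header vectors. It assumes that `get_header_vector` returns a flat sequence of numbers (for example a NumPy array); `cosine_similarity` and `header_similarity` are hypothetical helpers written for this illustration and are not part of the `table_embeddings` package.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: an all-zero vector
        # is treated as having no similarity to anything.
        return 0.0
    return dot / (norm_u * norm_v)


def header_similarity(model, term_a, term_b):
    """Compare two header terms using a loaded model.

    Only `get_header_vector` is taken from the Quick Start snippet;
    the assumption is that it returns a sequence of floats.
    """
    return cosine_similarity(
        model.get_header_vector(term_a),
        model.get_header_vector(term_b),
    )
```

With `model` loaded as in the snippet above, `header_similarity(model, 'headline', 'title')` would return a value between -1 and 1, where higher values indicate that the two terms are closer in the embedding space.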
## Model Types
| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150) |
| W-row | Model of row-wise relations in tables | [64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150) |
| W-combo | Model of row-wise relations and relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150) |
| W-plain | Model of row-wise relations in tables without pre-processing | [64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150) |
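The download links in the table follow a regular naming pattern, `ddrg/web_table_embeddings_<type><dim>`, which matches the identifier passed to `load_model` in the Quick Start. As a small sketch, the hypothetical helper below (not part of the `table_embeddings` package) builds such an identifier from a model type and dimension, and rejects combinations that are not listed in the table.

```python
# Model types and dimensions as listed in the table above.
MODEL_TYPES = ("tax", "row", "combo", "plain")
DIMENSIONS = (64, 150)


def model_id(model_type, dim):
    """Build a model identifier such as 'ddrg/web_table_embeddings_combo150'.

    Accepts the type with or without the 'W-' prefix used in the table.
    """
    name = model_type.lower()
    if name.startswith("w-"):
        name = name[2:]
    if name not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dim!r}")
    return f"ddrg/web_table_embeddings_{name}{dim}"
```

For example, `model_id('W-combo', 150)` yields the identifier used in the Quick Start, which can then be passed to `TableEmbeddingModel.load_model`.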
## More Information
For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).
More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):
```bibtex
@inproceedings{gunther2021pre,
title={Pre-Trained Web Table Embeddings for Table Discovery},
author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
pages={24--31},
year={2021}
}
```