I've been working through the first two lessons of
[the fastai course](https://course.fast.ai/). For lesson one I trained a model
to recognise my cat, Mr Blupus. For lesson two the emphasis is on getting those
models out in the world as some kind of demo or application.
[Gradio](https://gradio.app) and
[Hugging Face Spaces](https://huggingface.co/spaces) make it super easy to get a
prototype of your model on the internet.

This model has an accuracy of ~96% on the validation dataset.
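
As a rough illustration of how little glue code a Gradio demo needs, here is a minimal sketch of an `app.py` for a Space. It is not the exact code behind this demo: the file name `model.pkl` and the helper names `probs_to_label_dict` and `build_demo` are assumptions made up for this example, and it assumes the model was saved with fastai's `learn.export`.

```python
def probs_to_label_dict(labels, probs):
    """Pair each class label with its probability, in the shape gr.Label expects."""
    return {str(label): float(p) for label, p in zip(labels, probs)}


def build_demo(model_path="model.pkl"):
    """Build a Gradio interface around an exported fastai learner.

    `model_path` is a hypothetical file name, not one taken from this project.
    """
    # Imported lazily so the helper above can be used without these packages.
    import gradio as gr
    from fastai.vision.all import PILImage, load_learner

    learn = load_learner(model_path)
    labels = learn.dls.vocab  # class names in the order the model outputs them

    def predict(img):
        # gr.Image() passes a NumPy array by default; PILImage.create accepts that.
        _, _, probs = learn.predict(PILImage.create(img))
        return probs_to_label_dict(labels, probs)

    return gr.Interface(fn=predict, inputs=gr.Image(), outputs=gr.Label())
```

On a Space, `app.py` would end with something like `build_demo().launch()`, with `gradio` and `fastai` listed in `requirements.txt`.
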
## The Dataset

I downloaded a few thousand publicly-available FOIA documents from a government
website. I split the PDFs up into individual `.jpg` files and then used
[Prodigy](https://prodi.gy/) to annotate the data. (This process was described
in
[a blog post written last year](https://mlops.systems/fastai/redactionmodel/computervision/datalabelling/2021/09/06/redaction-classification-chapter-2.html).)
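
The PDF-splitting step could look something like the sketch below. The tooling I actually used isn't stated here, so treat this as one possible approach: it assumes the third-party `pdf2image` package (which needs poppler installed), and the function names, the output naming scheme and the `dpi` value are all invented for illustration.

```python
from pathlib import Path


def page_filename(pdf_path, page_number):
    """Build a name like 'report_p003.jpg' for a 1-based page number."""
    return f"{Path(pdf_path).stem}_p{page_number:03d}.jpg"


def split_pdf_to_jpgs(pdf_path, out_dir, dpi=150):
    """Render every page of one PDF to its own .jpg file and return the paths."""
    # Imported lazily: pdf2image is an assumed dependency, not confirmed by this README.
    from pdf2image import convert_from_path

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    for number, page in enumerate(pages, start=1):
        target = out_dir / page_filename(pdf_path, number)
        page.convert("RGB").save(target, "JPEG")
        written.append(target)
    return written
```

Writing one image per page keeps each annotation task in Prodigy down to a single page.
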
## Training the model

I trained the model with fastai's flexible `vision_learner`, fine-tuning
`resnet18`, which was both smaller than `resnet34` (no surprises there) and less
liable to early overfitting. I trained the model for 10 epochs.
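
For anyone wanting to try something similar, a training run along these lines might be sketched as follows. Several details are assumptions rather than a record of what I ran: the folder layout (one subfolder per class), the 20% validation split, the 224px resize, the use of `fine_tune` as the training call, and the names `train_redaction_classifier` and `model.pkl`.

```python
def train_redaction_classifier(image_dir, epochs=10, export_name="model.pkl"):
    """Fine-tune a pretrained resnet18 on a folder of labelled page images.

    Assumes `image_dir` contains one subfolder per class; that layout is
    hypothetical and not taken from the original project.
    """
    # Imported lazily so that importing this module does not require fastai.
    from fastai.vision.all import ImageDataLoaders, Resize, accuracy, vision_learner
    from torchvision.models import resnet18

    dls = ImageDataLoaders.from_folder(
        image_dir,
        valid_pct=0.2,          # assumed split: hold out 20% for validation
        seed=42,                # fixed seed so the split is reproducible
        item_tfms=Resize(224),  # assumed input size
    )

    learn = vision_learner(dls, resnet18, metrics=accuracy)
    # By default fine_tune trains the new head for one frozen epoch first,
    # then unfreezes the body and trains for `epochs` more.
    learn.fine_tune(epochs)
    learn.export(export_name)   # saved under learn.path
    return learn
```

Swapping `resnet18` for `resnet34` in the `vision_learner` call is all it takes to compare the two architectures.
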
## Further Reading

This initial dataset spurred an ongoing interest in the domain and I've since
been working on the problem of object detection, i.e. identifying exactly which
parts of the image contain redactions.

Some of the key blog posts I've written about this project:

- How to annotate data for an object detection problem with Prodigy
  ([link](https://mlops.systems/redactionmodel/computervision/datalabelling/2021/11/29/prodigy-object-detection-training.html))
- How to create synthetic images to supplement a small dataset
  ([link](https://mlops.systems/redactionmodel/computervision/python/tools/2022/02/10/synthetic-image-data.html))
- How to use error analysis and visual tools like FiftyOne to improve model
  performance
  ([link](https://mlops.systems/redactionmodel/computervision/tools/debugging/jupyter/2022/03/12/fiftyone-computervision.html))
- Creating more synthetic data focused on the tasks my model finds hard
  ([link](https://mlops.systems/tools/redactionmodel/computervision/2022/04/06/synthetic-data-results.html))
- Data validation for object detection / computer vision (a three-part series:
  [part 1](https://mlops.systems/tools/redactionmodel/computervision/datavalidation/2022/04/19/data-validation-great-expectations-part-1.html),
  [part 2](https://mlops.systems/tools/redactionmodel/computervision/datavalidation/2022/04/26/data-validation-great-expectations-part-2.html),
  [part 3](https://mlops.systems/tools/redactionmodel/computervision/datavalidation/2022/04/28/data-validation-great-expectations-part-3.html))