Missing SDG 17 in the output
Hello!
Is there any reason why SDG 17 is missing from the output?
How can we get this updated so that it appears?
Hello Andrew,
I am sorry, but the OSDG labelled dataset I used did not include SDG17. I understand that the database has been updated. I will check whether it now includes SDG17, update the model, and let you know if SDG17 is now available in the database.
Regards
Sadick
Hello Andrew,
I just checked the updated dataset used in training the model, and unfortunately, "SDG17-Partnerships for the Goals" is not covered. SDG17 is a hard one given that it focuses on strengthening the means of implementation and revitalising the global partnership for sustainable development.
If you have data that aligns with SDG17, I'm happy to work with you to get it integrated into the dataset and update the model to automate its classification.
Regards
Sadick
Thanks @sadickam!
Thanks for making the effort to check this for me. I'll try to find a dataset containing SDG17 and get back to you.
Hi All,
FYI: there is a global SDG Classification expert group. They meet every 3 months. See https://sdg-ai.org/
- SDG 17 labeled research papers can be found here: https://zenodo.org/doi/10.5281/zenodo.5205672
- To use it for training, an extra step is needed: fetch the abstracts for the DOIs using https://docs.openalex.org/api-entities/works. Due to licensing we were not allowed to publish the abstracts, so they need to be fetched where there is a CC0 licence on the publisher's abstract. Training the model on titles alone gives bad results; abstracts and full texts work better, but are not always available.
- Our trained models can be found here: https://zenodo.org/doi/10.5281/zenodo.7304546 (for visualisations: one faster multi-label model covering all 17 SDGs)
- And here: https://doi.org/10.5281/zenodo.5835849 (for data analysis: 17 models, one per SDG, for more accuracy)
- Report on data labelling, training and evaluation: https://zenodo.org/doi/10.5281/zenodo.5603019
Enjoy! More about the project here: https://aurora-universities.eu/sdg-research/
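For the abstract-fetching step in the list above, here is a minimal Python sketch, not the project's actual pipeline. It assumes the OpenAlex works endpoint accepts a DOI in the form `https://api.openalex.org/works/https://doi.org/<doi>` and returns the abstract as an `abstract_inverted_index` (a mapping from each word to its positions). The function names `rebuild_abstract` and `fetch_abstract` are made up for illustration, and the sketch does not check the licence of the abstract, so that still has to be done separately.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the OpenAlex works endpoint.
OPENALEX_WORKS = "https://api.openalex.org/works/"


def rebuild_abstract(inverted_index):
    """Turn an inverted index {word: [positions]} back into plain text.

    Returns None when no index is available for the work.
    """
    if not inverted_index:
        return None
    word_at = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            word_at[pos] = word
    return " ".join(word_at[pos] for pos in sorted(word_at))


def fetch_abstract(doi, mailto=None):
    """Fetch one work by DOI and return its abstract text, or None.

    `mailto` is an optional contact address passed as a query parameter.
    Raises urllib.error.HTTPError if the DOI is not found.
    """
    url = OPENALEX_WORKS + "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    if mailto:
        url += "?mailto=" + urllib.parse.quote(mailto)
    with urllib.request.urlopen(url, timeout=30) as resp:
        work = json.load(resp)
    return rebuild_abstract(work.get("abstract_inverted_index"))
```

Calling `fetch_abstract` on each DOI in the labelled dataset and skipping the rows that come back as `None` would give abstract/label pairs for training, subject to the licence check mentioned above.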
If there is a possibility you could help me put these trained models on Hugging Face and re-use that nice UI wrapper you built, that would be amazing. (I have no clue where to even start. I put some stuff here: https://huggingface.co/MauriceV2021/AuroraSDGsModel )
Hello @MauriceV2021 ,
Thank you for the clarification and additional information. The training completely excluded SDG17. We only used the data that was made publicly available.
I will check out the links you provided to the multi-language BERT model and the report.
Thank you for enriching the conversation and providing pointers to interesting information about the SDG classification expert group.
Kind Regards
Sadick
Hello @MauriceV2021 ,
About Hugging Face and the UI, I will be happy to work with you to make things happen. Please get in touch at s.sadick@deakin.edu.au so we can discuss and get things moving.
Looking forward to hearing from you.
Regards
Sadick
Hello @MauriceV2021 and @sadickam!
Thanks for the great feedback! I'm really keen to know how it is going.
Is there anything I can do to help with extending the model to SDG17?
@sadickam, we have another SDG we would like to add: the non-official SDG18, "Cultural heritage and preservation".
What is the general process for re-training this model? How much relevant data do we need for training?
Hello @AndrewPolyloop ,
I have looked at the data @MauriceV2021 referenced in his post. I will retrieve some of that data and update the model to include SDG17. My new school semester has just started, so things are a bit tight right now. In about a week I should have a bit of time to get on to this.
I intend to work with @MauriceV2021 to make his model easily accessible on HF so you can also label from his repository.
Regarding the non-official SDG18, it can easily be added to the model by updating the training data to include text labelled SDG18. The OSDG data I used has about 44000 rows, and each SDG has approximately 2000 or more training examples. I am not sure how much data you have on your SDG18; however, I would say that about 2000 or more paragraphs or abstracts, each about 200 words long or more, would work.
You can do with fewer training examples, but the reliability of the class may be low. I am happy to discuss this further with you and assist in getting it integrated. However, given that SDG18 is not official, it will be best to develop a new model that includes it. Happy to help you accomplish this. We can discuss this via email or Zoom to get it operationalized.
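As a rough way to check a labelled dataset against the rule of thumb above (about 2000 examples per SDG, each roughly 200 words or more), here is a minimal Python sketch. The function name `label_coverage`, the `(text, label)` row format, and the default thresholds are assumptions for illustration; the thresholds are a guideline from this thread, not a hard requirement of the model.

```python
from collections import Counter


def label_coverage(rows, min_examples=2000, min_words=200):
    """Count usable training examples per label.

    rows: iterable of (text, label) pairs (hypothetical format).
    A text counts as usable if it has at least `min_words`
    whitespace-separated words.
    Returns {label: (usable_count, meets_min_examples)}.
    """
    usable = Counter()
    labels_seen = set()
    for text, label in rows:
        labels_seen.add(label)
        if len(text.split()) >= min_words:
            usable[label] += 1
    return {
        label: (usable[label], usable[label] >= min_examples)
        for label in sorted(labels_seen)
    }
```

Running this over the combined data before re-training shows which labels, such as a new SDG17 or SDG18 class, fall short of the guideline and may end up less reliable.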
Regards
Sadick
Hello @sadickam !
Do you have any update on the attempt to add SDG17? Do you need any help with it?
Thanks,
Andrew
Hello @AndrewPolyloop ,
Sorry, I have been on compassionate leave due to the passing of my Dad. I will get back to you as soon as I can and let you know how you may be able to help with SDG17.
Regards
Sadick
@sadickam I'm sorry to hear that :( Take care of yourself and your family.