SetFit with ppsingh/TAPP-multilabel-mpnet
This is a SetFit model that can be used for Text Classification. This SetFit model uses ppsingh/TAPP-multilabel-mpnet as the Sentence Transformer embedding model. A SetFitHead instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
Model Details
Model Description
Model Sources
Model Labels
| Label |
Examples |
| NEGATIVE |
- '(p 70-1).Antigua and Barbuda’s 2021 update to the first Nationally Determined Contribution the most vulnerable in society have been predominantly focused on adaptation measures like building resilience to flooding and hurricanes. The updated NDC ambition provides an opportunity to focus more intently on enabling access to energy efficiency and renewable energy for the most vulnerable, particularly women who are most affected when electricity is not available since the grid is down after an extreme weather event. Nationally, Antigua and Barbuda intends to utilize the SIRF Fund as a mechanism primarily to catalyse and leverage investment in the transition for NGOs, MSMEs and informal sectors that normally cannot access traditional local commercial financing due to perceived high risks.'
- 'The transport system cost will be increased by 16.2% compared to the BAU level. Electric trucks and electric pick-ups will account for the highest share of investment followed by electric buses and trucks. In the manufacturing industries, the energy efficiency improvement in the heating and the motor systems and the deployment of CCS require the highest investment in the non-metallic and the chemical industries in 2050. The manufacturing industries system cost will be increased by 15.3% compared to the BAU level.'
- 'Figure 1-9: Total GHG emissions by sector (excluding LULUCF) 2000 and 2016 1.2.2 Greenhouse Gas Emission by Sector • Energy Total direct GHG emissions from the Energy sector in 2016 were estimated to be 253,895.61 eq. The majority of GHG emissions in the Energy sector were generated by fuel combustion, consisting mostly of grid-connected electricity and heat production at around eq (42.84%). GHG emissions from Transport, Manufacturing Industries and Construction, and other sectors were 68,260.17 GgCO2 eq eq (6.10%), respectively. Fugitive Emissions from fuel eq or a little over 4.33% of total GHG emissions from the Energy sector. Details of GHG emissions in the Energy sector by gas type and source in 2016 are presented in Figure 1-10. Source: Thailand Third Biennial Update Report, UNFCCC 2020.'
|
| TARGET |
- 'DNPM, NFA,. Cocoa. Board,. Spice Board,. Provincial. gov-ernments. in the. Momase. region. Ongoing -. 2025. 340. European Union. Support committed. Priority Sector: Health. By 2030, 100% of the population benefit from introduced health measures to respond to malaria and other climate-sensitive diseases in PNG. Action or Activity. Indicator. Status. Lead. Implementing. Agencies. Supporting. Agencies. Time Frame. Budget (USD). Funding Source. (Existing/Potential). Other Support. Improve vector control. measures, with a priority. of all households having. access to a long-lasting. insecticidal net (LLIN).'
- 'Conditionality: With national effort it is intended to increase the attention to vulnerable groups in case of disasters and/or emergencies up to 50% of the target and 100% of the target with international cooperation. Description: In this goal, it is projected to increase coverage from 33% to 50% (211,000 families) of agricultural insurance in attention to the number of families, whose crops were affected by various adverse weather events (flood, drought, frost, hailstorm, among others), in addition to the implementation of comprehensive actions for risk management and adaptation to Climate Change.'
- 'By 2030, upgrade watershed health and vitality in at least 20 districts to a higher condition category. By 2030, create an inventory of wetlands in Nepal and sustainably manage vulnerable wetlands. By 2025, enhance the sink capacity of the landuse sector by instituting the Forest Development Fund (FDF) for compensation of plantations and forest restoration. Increase growing stock including Mean Annual Increment in Tarai, Hills and Mountains. Afforest/reforest viable public and private lands, including agroforestry.'
|
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
model = SetFitModel.from_pretrained("ppsingh/iki_target_setfit")
preds = model("In the oil sector, the country has benefited from 372 million dollars for the reduction of gas flaring at the initiative (GGFR - \"Global Gas Flaring Reduction\") of the World Bank after having adopted in November 2015 a national reduction plan flaring and associated gas upgrading. In the electricity sector, the NDC highlights the development of hydroelectricity which should make it possible to cover 80% of production in 2025, the remaining 20% ​​being covered by gas and other renewable energies.")
Training Details
Training Set Metrics
| Training set |
Min |
Median |
Max |
| Word count |
58 |
116.6632 |
508 |
| Label |
Training Sample Count |
| NEGATIVE |
51 |
| TARGET |
44 |
Training Hyperparameters
- batch_size: (8, 2)
- num_epochs: (1, 0)
- max_steps: -1
- sampling_strategy: undersampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
Training Results
| Epoch |
Step |
Training Loss |
Validation Loss |
| 0.0018 |
1 |
0.3343 |
- |
| 0.1783 |
100 |
0.0026 |
0.1965 |
| 0.3565 |
200 |
0.0001 |
0.1995 |
| 0.5348 |
300 |
0.0001 |
0.2105 |
| 0.7130 |
400 |
0.0001 |
0.2153 |
| 0.8913 |
500 |
0.0 |
0.1927 |
Training Results Classifier
- Classes Representation in Test Data: Target: 9, Negative: 8
- F1-score: 87.8%
- Accuracy: 88.2%
Environmental Impact
Carbon emissions were measured using CodeCarbon.
- Carbon Emitted: 0.006 kg of CO2
- Hours Used: 0.185 hours
Training Hardware
- On Cloud: No
- GPU Model: 1 x Tesla T4
- CPU Model: Intel(R) Xeon(R) CPU @ 2.00GHz
- RAM Size: 12.67 GB
Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.3.1
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.3.0
- Tokenizers: 0.15.1
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}