A Python package for working with SEC filings at scale. Full Documentation | Website
- datamule-data: Contains datasets for use with datamule-python
- datamule-indicators: Create economic indicators from SEC filings
- txt2dataset: Create datasets from unstructured text
- secsgml: Parse SEC filings in SGML format
- doc2dict: Convert documents to dictionaries (not ready for public use)
- Download SEC filings quickly and efficiently
- Monitor EDGAR for new filings in real-time
- Parse filings at scale
- Access comprehensive datasets (10-Ks, SIC codes, etc.)
pip install datamule
from datamule import Portfolio
# Create a Portfolio object
portfolio = Portfolio('output_dir') # can be an existing directory or a new one
# Download submissions
portfolio.download_submissions(
    filing_date=('2023-01-01', '2023-01-03'),
    submission_type=['10-K']
)
# Iterate through documents by document type
for ten_k in portfolio.document_type('10-K'):
    ten_k.parse()
    print(ten_k.data['document']['part2']['item7'])
# Iterate through documents by what strings they contain
for document in portfolio.contains_string('United States'):
    print(document.path)
# You can also use regex patterns
for document in portfolio.contains_string(r'(?i)covid-19'):
    print(document.type)
# For faster operations, use the built-in threading by passing a callback function
def callback(submission):
    print(submission.path)

submission_results = portfolio.process_submissions(callback)
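
Because the callback may be invoked from several threads, a callback that accumulates results should guard any shared state. The following is a minimal sketch, not part of datamule: `PathCollector` is a hypothetical helper, and the stand-in objects built with `SimpleNamespace` only mimic the `path` attribute used in the example above so that the snippet runs without the package installed.

```python
import threading
from types import SimpleNamespace


class PathCollector:
    """Callable that records the `path` of every submission it receives."""

    def __init__(self):
        self._lock = threading.Lock()
        self.paths = []

    def __call__(self, submission):
        # Assumption: submissions expose a `path` attribute, as in the example above.
        with self._lock:  # guard the shared list in case callbacks run concurrently
            self.paths.append(str(submission.path))


# Stand-in objects (hypothetical) so the sketch runs without datamule.
fake_submissions = [SimpleNamespace(path=f"output_dir/sub_{i}") for i in range(100)]

collector = PathCollector()
threads = [threading.Thread(target=collector, args=(s,)) for s in fake_submissions]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(collector.paths))  # 100
```

With a real portfolio, the same object could presumably be passed as `portfolio.process_submissions(collector)`, assuming any callable is accepted in place of a plain function. The lock keeps the sketch correct if the callback later does more than a single append.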
from datamule import Sheet
sheet = Sheet('apple')
sheet.download_xbrl(ticker='AAPL')
from datamule import Index
index = Index()
results = index.search_submissions(
    text_query='tariff NOT canada',
    submission_type="10-K",
    start_date="2023-01-01",
    end_date="2023-01-31",
    quiet=False,
    requests_per_second=3,
)
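
The search above covers a single month. For longer periods, one option is to split the range into monthly windows and run one search per window. The sketch below uses only the standard library; `month_windows` is a hypothetical helper, not a datamule function, and whether splitting is actually needed for the search service is an assumption.

```python
import calendar
from datetime import date


def month_windows(start: date, end: date):
    """Yield (start_date, end_date) ISO strings, one pair per calendar month."""
    if start > end:
        raise ValueError("start must not be after end")
    current = start
    while current <= end:
        last_day = calendar.monthrange(current.year, current.month)[1]
        window_end = min(date(current.year, current.month, last_day), end)
        yield current.isoformat(), window_end.isoformat()
        # Move to the first day of the next month.
        if current.month == 12:
            current = date(current.year + 1, 1, 1)
        else:
            current = date(current.year, current.month + 1, 1)


windows = list(month_windows(date(2023, 1, 15), date(2023, 3, 10)))
print(windows)
# [('2023-01-15', '2023-01-31'), ('2023-02-01', '2023-02-28'), ('2023-03-01', '2023-03-10')]
```

Each pair could then be passed as `start_date` and `end_date` to `index.search_submissions`, with the other arguments unchanged from the example above.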
Create a Discord bot, use insider trading disclosures to map relationships in Silicon Valley, and more in the examples.
The default data source is the SEC. For faster downloads, you can switch the source to datamule.
from datamule import Config
config = Config()
config.set_default_source("datamule") # set default source to datamule, can also be "sec"
print(f"Default source: {config.get_default_source()}")
To use datamule as a provider, you need an API key.
- How to host the SEC Archive for $20/month
- Creating Structured Datasets from SEC filings
- Deploy a Financial Chatbot in 5 Minutes