Getting Interactive Brokers Statements into a Database

Published: 7 Feb 2023

Interactive Brokers provide activity statements in various formats, including PDF or CSV. That said, if you have a large number of transactions, it is better to insert all your IB statements into a database, which you can query as needed. It saves a lot of time. It automates logging of your trades. It makes performance analysis, accounting, and taxes much easier.

Below I introduce the rough approach how I do it. I will not go into too much detail of individual table schemas and scripts, just the big picture and a few useful hints. I use Python and MariaDB, but the logic is also valid for other programming languages and DBMS.

Downloading CSV statements from IB

First step is to download the statements from IB.

Go to your Account Management, Statements, choose Activity Statements.

Choose the CSV format, which is easiest to process.

IB offers them in various frequencies – daily, monthly, annual, or custom periods.

For my database I choose daily, as I want daily mark-to-market and valuations.

I also download monthly and annual, but only use some of the data from those (the transactions are obviously the same).

I download all the statements manually, which is fast, as the IB statement download interface has improved recently (previously the form did not remember the date you requested last, so you always had to click through the calendars when downloading statements from a more distant past – not it does and everything is quick).

IB CSV statement structure

IB Activity Statement CSV has a variable number of columns, but the first two are always the same.

The first column is statement section, which can have the following values (there may be more if you trade something different than I do):

  • Statement
  • Account Information
  • Net Asset Value
  • Change in NAV
  • Mark-to-Market Performance Summary
  • Realized & Unrealized Performance Summary
  • Month & Year to Date Performance Summary
  • Cash Report
  • Open Positions
  • Collateral for Customer Borrowing
  • Forex Balances
  • Trades
  • Interest Accruals
  • Financial Instrument Information
  • Base Currency Exchange Rate
  • Codes
  • Notes/Legal Notes

Each of these statement sections has different structure (different column count and meaning), but there are a few useful features:

  • Number and meaning of columns is consistent within the same section.
  • Rows belonging to same section are consecutive.
  • The first row of each section is header row.
  • Second column in all sections indicates row type. It can be either "Header" or "Data".

For further processing it is best to split the statement CSV file into individual section CSV files.

Splitting statement CSV into sections

A few hints (Python):

Read the CSV with open() readlines(). Something like pandas.read_csv() will not work due to the inconsistent structure between sections.

in_path = ...  # path to CSV file

with open(in_path, 'r') as f:
    rows = f.readlines()

Because the first column is always section name, which we need to allocate the row to the right section, we can split each row by the first comma into two parts:

  • Section name
  • Rest of row

Just beware double quotes. Some section names contain a comma and may be enclosed in double quotes.

The output of this process can be a dict with keys = section names and values = list of rows for that section. Then it is easy to create individual section CSVs from that.

Database tables

Most of the above listed statement sections have a corresponding table in my database. Therefore, the schema is mostly determined by the IB CSV statement sections and columns (with only minor changes in most tables).

There are five parts of my IB statements database:

  • Balance = balances at the end of statement period
  • Performance = position value changes, interest accruals, and other items affecting total profit/loss for the statement period
  • Statement = meta and aggreate information about each statement
  • Structure = meta and helper tables with information about the database structure
  • Transactions = trades, interest, fees, deposits and withdrawals during the statement period

Each part includes several tables, which are listed below.

Balance tables

  • bal_cash = "Forex Balances" CSV statement section
  • bal_complex_positions = "Complex Positions Summary"
  • bal_nav = "Net Asset Value"
  • bal_positions = "Open Positions"

Performance tables

  • per_cash_changes = "Cash Report"
  • per_interest_accruals = "Interest Accruals"
  • per_mtm = "Mark-to-Market Performance Summary"
  • per_nav_changes = "Change in NAV"
  • per_performance = "Realized & Unrealized Performance Summary"
  • per_ytd_performance = "Month & Year to Date Performance Summary"

Statement tables

Tables in this part contain meta and aggregate data for individual statements, such as exchange rates used to valuate non-base currency positions or total profit/loss and performance for the statement period.

  • smt_currency_rates = "Base Currency Exchange Rate"
  • smt_codes = "Codes"
  • smt_locations = "Locations"
  • smt_notes = "Notes-Legal Notes"
  • smt_statements = "Statement" + "Total P-L" + "Total Return for Statement Period"
  • smt_symbols = "Financial Instrument Information"

The table smt_symbols contains symbols and other specifications for any financial instrument that appears in the statements. I only insert each symbol the first time, from the first statement where it appears.

Structure tables

Tables in this part do not correspond to CSV statement sections. They are mainly for my own reference.

  • str_columns = dict of CSV statement section column names (values in the Header column) to database table column names
  • str_instrument_types = equity / futures / option / forex
  • str_sections = dict of CSV sections to database table names
  • str_statement_types = daily / monthly / yearly / custom range / special

The tables str_sections and str_columns are used by the update script to find the correct database table and column for every CSV statement section and column.

The table str_statement_types is used to give each statement a standardized name. For instance, all daily statements are named "Dyymmdd" (year, month, day), all monthly statements are named "Myymm" etc.

Transaction tables

  • trn_trades = "Trades"
  • trn_interest = "Interest"
  • trn_fees = "Fees"
  • trn_funding = "Deposits & Withdrawals"

Transactions are the same in all statements covering a particular trading day. Therefore, I only insert them from the daily statements to avoid duplicates.

By remaining on this website or using its content, you confirm that you have read and agree with the Terms of Use Agreement.

We are not liable for any damages resulting from using this website. Any information may be inaccurate or incomplete. See full Limitation of Liability.

Content may include affiliate links, which means we may earn commission if you buy on the linked website. See full Affiliate and Referral Disclosure.

We use cookies and similar technology to improve user experience and analyze traffic. See full Cookie Policy.

See also Privacy Policy on how we collect and handle user data.

© 2024 FinTut