TL;DR: These are the 3 major steps we took to reduce our memory footprint (0.2GB instead of 4GB) and the time to generate the reports (2 minutes instead of 15):

  • streamed the data directly to disk with xlsxtream;
  • switched the memory allocator to jemalloc; and
  • selected only what we needed, avoiding ActiveRecord when pulling large amounts of data from the database.

Our setup

Our main app is a rather big Rails monolith (235 Models, 220 Controllers, …) that had been running on Heroku for the last few years.

Recently, we moved towards developing all new functionality as smaller (micro)services alongside this monolith. But hosting these smaller applications on Heroku quickly became too expensive, so we began slowly migrating them to AWS EKS (AWS's managed Kubernetes service).

After a while only our monolith remained on Heroku, so we decided to move it too.

Since there are parts of our monolith that are not so majestic, we expected some problems. Our biggest concern was reporting and its memory usage.

Reporting history

Reporting was developed in the early days of BetterDoc. Everything started as simple HTML tables. When the need to export them to Excel arose, we added a simple button to download the report as CSV. That turned out to be error-prone, so we switched to generating .xlsx files with the axlsx gem instead. We also started generating the reports in the background and uploading them to our secure storage. Everything was working great until one day the reports stopped coming.

We quickly realised that our Sidekiq jobs were being lost. We were using Standard 2x dynos with 1GB of memory, which was no longer enough to generate the reports. We quickly scaled to a Performance M dyno with 2.5GB of memory and everything was working again. Performance M dynos are 4 times more expensive than Standard 2x, so we needed to fix our reports. We had other priorities, though, so we just wrote a script that would auto-scale the worker dynos while the reports were being generated. This turned out to be good enough for the time being (we only needed the bigger dyno for a few hours during the week), so we never got around to implementing a real fix. 🥴
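
A script like that can be built on top of Heroku's platform-api gem. The sketch below is only an illustration, not our actual script; the app name, dyno sizes and token source are placeholders:

# Illustrative sketch: bump the "worker" formation to a bigger dyno size before
# the reports run and shrink it again afterwards. App name, dyno sizes and the
# OAuth token source are placeholders.
require "platform-api"

heroku = PlatformAPI.connect_oauth(ENV.fetch("HEROKU_OAUTH_TOKEN"))

heroku.formation.update("our-app", "worker", "size" => "performance-m")
# ... enqueue the report jobs and wait for them to finish ...
heroku.formation.update("our-app", "worker", "size" => "standard-2x")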

Migration to AWS

When we started the migration from Heroku, it took almost 15 minutes to generate the biggest report, and during that time the worker consumed more than 4GB of memory. We were scaling up to Performance L dynos to make it work.

The (pseudo) code for the report looked like this:

Axlsx::Package.new do |p|
  p.workbook.add_worksheet(name: "Some report") do |sheet|
    ActiveRecord::Base.connection.uncached do
      # Load inquiries in batches, together with the associations the report needs.
      Inquiry.reporting.includes(:appointments, :iterations).find_each do |inquiry|
        sheet.add_row(values_for_report(inquiry))
      end
    end
  end

  # The whole workbook is built in memory and only written to disk here.
  p.serialize("report.xlsx")
end

Step 1: Stream to Excel

We were already using find_each so that the ActiveRecord objects got initialized in batches instead of all at once, but axlsx still built up the whole report in memory before writing it to disk. To avoid this we switched to the xlsxtream gem, which streams the rows to disk instead of writing them all at once. Luckily for us, the authors of xlsxtream were smart enough to keep the API very similar to axlsx's, so the required changes were minimal.

Xlsxtream::Workbook.open("report.xlsx") do |wb|
  wb.add_worksheet(name: "Some report") do |sheet|
    ActiveRecord::Base.connection.uncached do
      Inquiry.reporting.includes(:appointments, :iterations).find_each do |inquiry|
        sheet.add_row(values_for_report(inquiry))
      end
    end
  end
end

This change was enough to keep memory usage in the 2GB - 2.5GB range, and we were pretty confident that this would be enough to make it work on AWS. We migrated our monolith, tested everything on staging, and it all worked fine. 👍

Step 2: Reduce memory fragmentation

Soon after the deploy to production we got a ticket reporting that some reports were not being generated. Our Sidekiq workers were being killed and restarted because they were reaching their memory limits. Generating the reports kept our workers busy for longer periods of time, and they were running a high number of threads. We were already aware of Ruby's memory fragmentation issues when using multiple threads, so we decided to try jemalloc. In a few hours we had jemalloc running in production and the reports were successfully generated with memory usage below 1.5GB. 💪
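
How exactly jemalloc gets enabled depends on the setup; in a Docker image it typically means installing libjemalloc and pointing LD_PRELOAD at it before the Ruby process starts. A small, hypothetical sanity check like the one below (assuming a Linux container, where /proc/self/maps is available) can confirm that the allocator was actually loaded:

# Hypothetical initializer, not from our codebase: warn at boot if libjemalloc
# is not mapped into the process (e.g. because LD_PRELOAD was not set).
# config/initializers/jemalloc_check.rb
if File.readable?("/proc/self/maps")
  jemalloc_loaded = File.foreach("/proc/self/maps").any? { |line| line.include?("jemalloc") }
  Rails.logger.warn("jemalloc is NOT loaded - check LD_PRELOAD") unless jemalloc_loaded
end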

Step 3: Optimize the data retrieval

While working on the first two fixes we also spent some time thinking about whether we could replace ActiveRecord. ActiveRecord is a feature-rich ORM, but it is not a good fit for pulling large amounts of “raw” data from the database. We decided to try Sequel instead.
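
Sequel manages its own connection pool, separate from ActiveRecord's, so it has to be pointed at the same database. A minimal setup sketch (assuming the database URL is available via ENV["DATABASE_URL"]; the constant name is just an illustration) could look like this:

# Illustrative setup: connect Sequel to the same database the Rails app uses,
# with its own small connection pool dedicated to report generation.
require "sequel"

REPORTING_DB = Sequel.connect(ENV.fetch("DATABASE_URL"), max_connections: 2)
Sequel::Model.db = REPORTING_DB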

Using Sequel::Model we were able to reuse existing decorators and make the switch as simple as possible.

In the previous ActiveRecord-based version we were also fetching a lot of columns - due to eager loading some associations (Inquiry.reporting.includes(:appointments, :iterations)) - but only used a small subset of them in the actual report. So this time we made sure to load only the columns that were strictly needed for a given report.

With this, the final (pseudo) code looked like this:

class Reports::Inquiry < Sequel::Model
  one_to_many :appointments, class: "Reports::Appointment"
  one_to_many :iterations, class: "Reports::Iteration"

  def decorate
    Inquiry::ReportDecorator.new(self)
  end
end

dataset = Reports::Inquiry
  .select(:only, :needed, :columns)
  .eager(
    appointments: proc { |ds| ds.select(:id, :inquiry_id, :scheduled_date) },
    iterations: proc { |ds| ds.select(:id, :inquiry_id, :data) }
  )

Xlsxtream::Workbook.open("report.xlsx") do |wb|
  wb.add_worksheet(name: "Some report") do |sheet|
    # paged_each fetches rows in pages, so the whole result set is never held in memory.
    dataset.paged_each do |inquiry|
      sheet.add_row(values_for_report(inquiry))
    end
  end
end

When these changes were deployed, the memory usage of our production workers dropped below 1GB. As a nice bonus, our biggest report was now generated in less than 2 minutes (down from over 15 minutes) and only consumed around 250MB of memory.

Conclusion

We should have done a lot of this when we first experienced these issues.

If you are generating big Excel files, do it the right way from the beginning: stream the data and make sure to load only what you need.

Oh, and use jemalloc and Sequel, they are great!

Resources