Profile Information:
Name: Abishek Das
Nickname: Abishekcs
Github: Abishekcs
Email: abishekdascs@gmail.com
Location: India
Time Zone: India (UTC+5:30)
Meeting with mentors:
- Reachable anytime through Slack, email, or Zulip.
- UTC 11:30 AM to 8:30 PM (IST 5:00 PM – 2:00 AM)
Typical Working hours:
- UTC 7:30 AM to 3:30 PM (IST 1:00 PM – 9:00 PM)
Synopsis:
[Programs & Events Dashboard] Improve performance of slow routes
The Wikimedia Programs & Events Dashboard is a web application designed for the global Wikimedia community. It handles a high volume of web traffic, which can lead to a poor user experience. Users often experience high latency, meaning there is a noticeable delay between their actions and the browser rendering a usable response. In some cases, this performance lag can even contribute to site downtime.
The project aims to fix the worst performance issues and improve the user experience, making the site "feel" faster and reducing downtime, by:
- Improving backend efficiency: Many of the indexes on the database tables are not used efficiently, which leads to suboptimal performance. This includes issues like index cardinality problems, which can cause the query planner to skip an index even when it would be useful for a search, or to use only part of a composite index for searching, ordering, or filtering. To identify these, the slow query log feature on the Outreach Dashboard will be helpful: it provides key information about how the application queries data and makes it possible to investigate the slow SQL queries, which is essential for improving indexing.
Also, many of the Dashboard's most serious problems revolve around N+1 queries, which appear across the application and in background jobs.
Regarding caching: based on the issues I've already worked on, I believe caching should be a last resort. If I can fix the SQL or optimize the view, I'll do that first. It's always better in the long run to fix the underlying problem rather than relying on caching, because well-optimized SQL queries can often be just as fast as cache operations. Also, if we can gain speed without adding anything to the existing infrastructure, we should do that instead. That said, if caching becomes necessary, especially for server-rendered views, there's a good chance that "Russian doll" caching built on fragment caching will be sufficient. Since the Dashboard already uses Redis, with fragment caching Redis will automatically evict old entries, so little to no manual cache sweeping will be required.
- Improving Ruby Code in the Backend: After optimizing the database and cache, the next layer to improve is the Ruby code running within the Rails application, where excessive object allocations or inefficient logic can cause performance issues. Ruby's object allocations and memory usage can have a significant impact on performance, especially under heavy load. Factors like the time spent in Ruby and the number of objects created (i.e. time spent on memory allocation) can become a serious bottleneck.
My usual approach will be to profile the code, locate slow sections, identify the root causes of the slowness, and rewrite the code to eliminate bottlenecks, then repeat the process until performance improves. Since everything in Ruby is an object, programs often consume extra memory just to represent data, and this high memory consumption is a major factor that can slow Ruby down. To optimize effectively, I will need to reduce the memory footprint. Doing so will reduce the time spent in garbage collection: the less memory the Dashboard's backend code uses, the less work Ruby's garbage collector has to do. Once I confirm that memory is being used optimally, I'll also analyze the algorithmic complexity of the code as a final step, ensuring that both memory and compute efficiency are aligned.
- Improving frontend behavior: To improve the user experience and make the site feel faster, it's important to focus on the frontend (i.e. Redux + React) as well, because frontend performance is really what web application performance means from the user's perspective. This includes enhancing the current state management system and optimizing how API calls are made from the frontend. One of the key improvements will be integrating RTK Query for a few API routes, particularly those that fetch large amounts of data or are used by multiple components across the application. This will help prevent redundant data fetching, reduce load on the backend, and ensure that components can efficiently share cached data without triggering multiple unnecessary requests. Additionally, with the recent introduction of Infinite Query in RTK Query, it's now possible to load data in chunks or pages, which helps prevent the frontend from choking on large payloads, especially in components that deal with extensive datasets like the article lists and the article finder.
On the React side, I plan to focus on reducing unnecessary re-renders, identifying memoization opportunities, and applying performance optimizations throughout the component tree. For instance, in the current implementation of the Details component, there is a slight input delay in all of the text areas because many state variables are held in the parent component rather than being localized in the children; every text input change re-renders all of the child components, which causes a noticeable delay in the text areas. Refactoring this and other similar components will make interactions more responsive, prevent unnecessary re-renders, and improve the overall performance of the UI.
These changes will greatly improve the performance of slow routes in the Programs and Events Dashboard.
Mentors:
Sage Ross (@Ragesoss), formasit (@Formasit-chijoh)
Deliverables:
- Improve Backend Efficiency through query optimization, indexing, and strategic caching.
- Optimize Ruby Code in the Backend, focusing on Ruby performance optimization, memory usage, and object allocation.
- Enhance Frontend Behavior by improving React performance and integrating Redux Toolkit with RTK Query.
Implementation Details:
These are the tools I will be using to detect performance issues and slow routes in the Programs and Events Dashboard codebase:
- Rack-Mini-Profiler: Rack Mini Profiler is useful for profiling database usage. It logs every SQL query that hits the database, showing how long each took and where it originated, including the line number and parent calls. It's invaluable for figuring out where to apply eager loading properly.
- Ruby Profilers: To investigate time spent in Ruby code, I will be using Ruby profilers such as stackprof and ruby-prof. stackprof will be very easy to use since it's built into rack-mini-profiler.
- Bullet: I will be using this gem to check when adding eager loading will help with N+1 queries, when eager loading is being used unnecessarily, and when a counter cache should be used.
- Prosopite: Prosopite is a modern alternative to Bullet for detecting N+1 queries. I will also be using it because it can detect N+1 queries with zero false positives or false negatives, a common issue with Bullet.
- Query Count: Rails 7.2 adds SQL query counts to the template rendering logs, but since the Dashboard is on Rails 7.0.7 that isn't available yet. So I will be using a query-count gem that surfaces the number of queries that ran on a given page or endpoint.
- New Relic: This is already used by the Dashboard. It will help in examining request queue time and server response time for slow requests, and is also helpful for finding N+1 queries. In addition, New Relic's Ruby VM tab can give a lot of information about the memory behavior of Rails applications, which can in turn point to different performance issues and opportunities for improvement.
- Eslint Plugin React Compiler: I will use this on the React side to identify unnecessary re-renders, spot memoization opportunities, and apply general performance optimizations. It helps enforce best practices and highlights areas in components where rendering efficiency can be improved.
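To make the Bullet and Prosopite detections visible during local profiling, both gems need to be enabled in the development environment. A possible initializer sketch, based on the gems' documented configuration options (the exact settings would need to be adapted to the Dashboard's existing configuration):

```ruby
# config/environments/development.rb (sketch; adapt to the Dashboard's setup)
config.after_initialize do
  Bullet.enable       = true  # detect N+1 queries and unused eager loading
  Bullet.rails_logger = true  # write warnings to the Rails log
  Bullet.add_footer   = true  # also surface warnings in the browser

  Prosopite.rails_logger = true  # log Prosopite's N+1 findings too
end
```

Prosopite additionally needs its scan hooks installed (for example, starting `Prosopite.scan` and calling `Prosopite.finish` around each request), as described in its README.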
Most of the major work for improving the slow routes in the Programs and Events Dashboard lies in the backend. For database optimization, I will focus on improving existing indexes and making sure they are fully utilized. Some key areas include adding indexes on columns that are frequently used in queries to speed up data retrieval, optimizing database queries to fetch only the required data, reducing the number of database calls, and finally making sure existing composite/multiple-column indexes are utilized to their maximum capability.
I will also work on optimizing ActiveRecord queries to make data access more efficient. For example, some parts of the Dashboard codebase often use methods like count and exists?, which always execute a query and can lead to unnecessary database roundtrips.
If needed, I will implement caching for frequently accessed data or costly operations. This will be done using fragment caching for specific parts of the views that are slow to render, and where possible I'll use Russian doll caching to make it more efficient. I will also focus on optimizing background jobs and improving their performance.
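The reason fragment caching needs so little manual sweeping is that its core operation is simply "return the stored value, or compute, store, and return it". The pure-Ruby stand-in below illustrates that fetch-or-compute semantics; TinyCache and the course-summary renderer are hypothetical examples, not Dashboard code (the real application would go through its Redis-backed Rails cache store):

```ruby
# Minimal stand-in for the fetch-or-compute behavior behind fragment caching.
class TinyCache
  attr_reader :hits, :misses

  def initialize
    @store  = {}
    @hits   = 0
    @misses = 0
  end

  # Return the cached value for key, or run the block, store its result,
  # and return it. Hit/miss counters are tracked for inspection.
  def fetch(key)
    if @store.key?(key)
      @hits += 1
      @store[key]
    else
      @misses += 1
      @store[key] = yield
    end
  end
end

cache = TinyCache.new

# A "slow" view fragment, stubbed as a plain string builder.
render_course_summary = lambda do |course_id|
  cache.fetch(["course_summary", course_id]) do
    "<div>summary for course #{course_id}</div>" # expensive render goes here
  end
end

render_course_summary.call(1) # miss: computed and stored
render_course_summary.call(1) # hit: served from the cache
puts "hits=#{cache.hits} misses=#{cache.misses}"
```

With a real cache store, eviction of old entries is handled by the store itself (e.g. Redis memory policies), which is why fragment caching keys can simply be left behind when the underlying record changes.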
Once the backend and database are optimized, I will move on to Ruby performance improvements, and finally shift focus to the frontend - specifically enhancing performance in React with Redux Toolkit (RTK) and RTK Query.
Due to this structure, I've divided the implementation into three phases:
- Phase 1: Improving backend efficiency
- Phase 2: Improving backend efficiency + Ruby code optimization
- Phase 3: Improving frontend behavior + backend and Ruby optimization
Details of each phase are explained in the Timeline Section.
Timeline
Phase 1: June 2 – July 1 (Improving backend efficiency)
- Reduce the average response time of the following currently known endpoints, which I discovered in my local development environment:
- CoursesController#articles
- Surveys#results
- Survey_assignment#index
- Courses#manual_update
- Ores_plot#campaign_plot
- CampaignsController#ores_plot
- CampaignsController#programs
- CampaignsController#articles
- CampaignController#users
- RevisionFeedbackController#index
- Improve performance for all the slow queries shown by the slow query log feature on the Outreach Dashboard server.
- To reproduce the issues seen in production locally, I will follow this process:
- Locate problems in production metrics through an APM, which is New Relic in the case of the Dashboard.
- Replicate the problem locally and profile it with stackprof, rack-mini-profiler, Prosopite, Query Count, etc. to find which lines of code are causing it.
- Create a benchmark around the problem areas.
- Iterate on a solution.
- Let the solution code run in production.
- Monitor the code
- Then optimize it further if required and determine whether caching is needed.
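The "create a benchmark around the problem areas" step can be as simple as a harness built on Ruby's standard Benchmark module. The two implementations below are hypothetical placeholders for a slow code path and its candidate replacement, not actual Dashboard code:

```ruby
require "benchmark"

# Placeholder "slow" path: materializes every match before taking ten.
def slow_version(rows)
  rows.select { |r| r[:views] > 100 }.map { |r| r[:title] }.first(10)
end

# Candidate fix: stop scanning as soon as ten results are collected.
def fast_version(rows)
  out = []
  rows.each do |r|
    next unless r[:views] > 100
    out << r[:title]
    break if out.size == 10
  end
  out
end

rows = Array.new(20_000) { |i| { title: "Article #{i}", views: i % 500 } }

Benchmark.bm(8) do |x|
  x.report("slow:") { 50.times { slow_version(rows) } }
  x.report("fast:") { 50.times { fast_version(rows) } }
end
```

Keeping the benchmark around after the fix makes it easy to re-check the numbers whenever the surrounding code changes.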
- Many of the Dashboard's most serious problems revolve around N+1 queries, so I will fix all of the obvious major N+1 issues across the application, including in background jobs.
- In addition to improving the /surveys/results performance, I will also fix all Bullet warnings related to the /surveys endpoints, including:
- /surveys
- /surveys/rapidfire/question_groups
- /surveys/results
- /surveys/results/
- Most of these fixes involve straightforward changes such as adding or removing eager loading, based on Bullet's suggestions. However, Bullet also recommends using counter caches for some of these endpoints. Implementing this will require changes in the RapidFire fork used by the Dashboard. If adding counter caches results in a noticeable reduction in database queries, I plan to submit a fix to the RapidFire fork as well.
- Fix all valid Bullet warnings across the Dashboard. Before implementing any changes suggested by Bullet, I will cross-check whether the warnings are accurate (i.e., not false positives or false negatives) and ensure they provide at least a minimal performance improvement to the Dashboard.
- Fix all N+1 queries automatically detected by Prosopite. There were several cases where Bullet failed to detect N+1 queries, for example in CoursesController#articles, but Prosopite successfully identified them. These Prosopite detections will be addressed to improve database efficiency.
- New Relic
- Next, I will review the New Relic dashboard for controller actions and sort them by time consumed. This will help identify the top 10 most time-consuming controller actions, allowing me to focus my performance work where it will have the biggest impact on both user experience and backend performance.
- I will also sort New Relic's transaction list by "longest response time" and work through it, improving the slowest-on-average transactions that have particularly bad N+1 queries, which might be causing tens of thousands of allocations.
Phase 2: July 2 – July 31 (Backend Efficiency + Ruby Code Optimization)
- Next, I will try to improve performance through caching, keeping the following rules in mind before implementing it:
- Is the data user-specific or frequently changing dynamic data? If yes, then caching is probably not a good idea.
- How often will the data change?
- Can I make some other optimizations instead?
- First, optimize Dashboard queries.
- Second, optimize Dashboard views.
- Then, check if caching is still required.
- If caching is a must to improve performance for a particular part of the Dashboard, then I'm going to apply these rules:
- At what layer of the stack should the Cache be implemented?
- At which layer does applying caching add the most complexity?
- Under what conditions should the implemented cache be purged?
- How long should the cache take to expire?
- What kind of load/traffic does the dashboard have for that particular endpoint?
- Also, make sure existing caches have a hit rate of 95% or more.
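The 95% check can be scripted against a cache's hit/miss counters; for Redis these are the keyspace_hits and keyspace_misses fields from INFO stats. The helper below is a small sketch, and the numbers fed into it are made up for illustration:

```ruby
# Compute a cache hit rate (as a percentage) from raw hit/miss counters,
# e.g. Redis's keyspace_hits and keyspace_misses.
def cache_hit_rate(hits, misses)
  total = hits + misses
  return 0.0 if total.zero?
  (hits.to_f / total * 100).round(2)
end

rate = cache_hit_rate(9_800, 200) # illustrative counters
puts format("hit rate: %.2f%% (target >= 95%%)", rate)
```

A cache whose hit rate sits well below the target is often a sign of keys that are too fine-grained or expiry that is too aggressive, which is worth checking before adding any new caching.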
- Optimize Active Record Queries
- Improve the performance of slow ActiveRecord queries that take 500 ms or more.
- Making sure count and exists? are only used where necessary, since they always execute a query and can lead to unnecessary database roundtrips.
- Making sure .all is not used carelessly: in development it doesn't seem harmful, but in production it might return hundreds of thousands of rows, which can also lead to Ruby memory issues.
- Avoid loading unnecessary columns by using .select instead of .all, making sure to select only those columns that are required.
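To make the count/exists? pitfall concrete without a database, here is a toy stand-in for an ActiveRecord relation. FakeRelation is hypothetical, not Dashboard code, but it mirrors the real difference: ActiveRecord's count always issues a query, while size on an already-loaded relation reuses the records in memory.

```ruby
# Toy relation that tracks how many "database queries" it runs.
class FakeRelation
  attr_reader :queries_run

  def initialize(records)
    @records     = records
    @loaded      = nil
    @queries_run = 0
  end

  # Like loading the relation once (SELECT * ...).
  def to_a
    @loaded ||= begin
      @queries_run += 1
      @records.dup
    end
  end

  # Like ActiveRecord's count: always goes back to the database.
  def count
    @queries_run += 1
    @records.size
  end

  # Like ActiveRecord's size: reuses loaded records when available.
  def size
    @loaded ? @loaded.size : count
  end
end

rel = FakeRelation.new(%w[a b c])
rel.to_a   # 1 query: load the records
rel.count  # 2 queries: count queries the database anyway
rel.size   # still 2 queries: size reuses the loaded records
puts "queries: #{rel.queries_run}"
```

In review, the rule of thumb is: prefer size (or length) when the records are already loaded, and reserve count for when only the number is needed and the records themselves never will be.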
- Send only required data through JSON to React
- Some of the JSON APIs in the Dashboard send data that is not used by React at all. For example, CoursesController#articles returns a large payload when accessed through the "Articles Edited" tab, but much of the JSON data is never actually used.
- I will work on redesigning some of these API endpoints to send only the data that is needed, reducing payload size and improving performance.
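The payload-trimming idea can be sketched in plain Ruby: keep an explicit whitelist of the fields the frontend actually reads and slice each record down to it before serializing. The field list and record shape below are illustrative, not the actual CoursesController#articles schema:

```ruby
require "json"

# Hypothetical whitelist of the fields the frontend actually uses.
REQUIRED_FIELDS = %i[id title views].freeze

# Serialize only the whitelisted fields of each record.
def slim_payload(records)
  records.map { |r| r.slice(*REQUIRED_FIELDS) }.to_json
end

records = [
  { id: 1, title: "Ruby",  views: 10, raw_html: "x" * 1_000, internal_flags: [] },
  { id: 2, title: "Rails", views: 7,  raw_html: "y" * 1_000, internal_flags: [] }
]

full = records.to_json
slim = slim_payload(records)
puts "full=#{full.bytesize} bytes, slim=#{slim.bytesize} bytes"
```

In a Rails controller the same effect would typically come from selecting fewer columns and tightening the serializer, but the principle is identical: the JSON should mirror what the React components consume, nothing more.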
- I will also specifically focus on the existing indexes in the database tables and check whether they are being fully utilized by the queries they support. If required, I will introduce new indexes to improve query performance. To evaluate and optimize index usage, I will use the following MySQL query analysis features:
- ANALYZE
- ANALYZE FORMAT=JSON
- EXPLAIN
- EXPLAIN EXTENDED
- Improve background jobs that operate on large datasets. I have yet to dive deeply into the codebase for this part, but I’m confident there will be opportunities for improvement. The Dashboard’s background jobs often deal with large result sets in ActiveRecord, which can significantly increase Ruby’s memory usage and slow down processing.
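The usual remedy for jobs that walk large result sets, mirrored here in plain Ruby, is to process the data in fixed-size batches (as ActiveRecord's find_each/in_batches do) so that only one batch's worth of derived objects is live at a time. The dataset and batch size below are illustrative:

```ruby
# Plain-Ruby sketch of batch processing, in the spirit of ActiveRecord's
# find_each: walk a large dataset in fixed-size slices so memory stays flat.
ids = (1..10_000).to_a # stand-in for a large query result

processed = 0
batches   = 0
ids.each_slice(500) do |batch|
  # Per-batch work would go here (e.g. updating records, enqueueing jobs).
  processed += batch.size
  batches   += 1
end
puts "processed=#{processed} in #{batches} batches"
```

With real ActiveRecord, find_each additionally issues a separate keyset-paginated query per batch, so the database never has to stream the whole result set at once either.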
- Ruby Code Optimization
- Optimize high memory consumption in Ruby code throughout the Dashboard to reduce garbage collection time and overall memory usage.
- To achieve this, I will focus on preventing unnecessary object allocations and memory leaks. Based on my research from various Ruby/Rails resources, most Rails applications create between 25,000 and 75,000 objects per transaction. While I’m not entirely sure how accurate these numbers are for this codebase, it's clear that excessive object allocation can negatively impact performance. If possible, I will try to speed up the application by either reducing the number of objects — doing the same work with fewer allocations — or by finding a completely different, more efficient approach where needed.
- Write more efficient iterators that consume less time and memory, such as using .map as demonstrated in PR #6279.
- When necessary, replace explicit iterators in views with render :collection, which is more memory- and time-efficient.
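The measure-then-reduce loop for allocations can be driven entirely from the standard library using GC.stat's cumulative :total_allocated_objects counter. Both implementations below are illustrative stand-ins, not Dashboard code; they produce the same string, but one allocates far more intermediate objects:

```ruby
# Count how many objects a block allocates, via GC.stat's cumulative counter.
def allocations_for
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

rows = Array.new(1_000) { |i| { title: "Article #{i}" } }

# Allocation-heavy: an intermediate array plus several new strings per row.
wasteful = allocations_for do
  formatted = []
  rows.each { |r| formatted << ("* " + r[:title] + "\n") }
  formatted.join
end

# Leaner: append into a single output string, creating no intermediates.
lean = allocations_for do
  out = String.new
  rows.each { |r| out << "* " << r[:title] << "\n" }
  out
end

puts "wasteful=#{wasteful} lean=#{lean} allocations"
```

Fewer allocations means less garbage for the GC to collect, which is exactly the "less memory, less GC work" effect described above; in a Rails process the same counter can be sampled around a suspect code path.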
Phase 3: August 1 – August 29 (Improving frontend behavior + backend and Ruby optimization)
- Continue with more backend and Ruby optimization.
- Set up RTK and RTK Query. I already have a PR open for this (#6204). One of the current challenges is that using RTK Query adds some complexity, especially due to the existing folder structure. I will work on simplifying this setup, including reorganizing the folders, to avoid confusion with the old Redux structure. This way, during the internship, I won’t have to worry about where files or folders should go, and can focus on building efficiently.
- Fix all performance issues in React detected by eslint-plugin-react-compiler.
- One of the main areas I want to focus on is identifying whether frontend improvements (React + Redux with RTK Query) can further optimize endpoints that have already been improved on the backend. For example, I currently have a PR (#6287) that significantly improves the performance of the CoursesController#articles endpoint. However, I believe that with additional frontend optimization — specifically by using RTK Query and its infinite query capabilities — performance can be improved even further. I will follow this same pattern across other components to evaluate whether their frontend counterparts can be enhanced in a similar way.
- Implement Prefetching for Course Tabs
- To further enhance the user experience on course pages, I will implement prefetching for the JSON data used in other tabs (e.g., Articles, Students, Revisions, Activity). This will involve preloading relevant data while the user is on the main course page, so that when they click on a tab, the content appears instantly with zero perceived latency. Prefetching can be achieved using resource hints or RTK Query’s built-in prefetch method. This is a low-cost change with a meaningful impact on responsiveness.
- Complete final enhancements to the backend and optimize Ruby code for improved performance.
- Additionally, I will write detailed documentation explaining all the technical changes I made, along with the reasons behind each change, clearly outlining every step taken.
Participation
For communication, I primarily intend to use Slack, as it is the most convenient for real-time discussions. I will also be available via email and Zulip, as previously mentioned. I have already been in contact with my mentors Sage Ross and formasit prior to the start of the application period. Additionally, I am regularly active on GitHub and Slack, so I can promptly respond to any issues or discussions there. Throughout the coding period, I will be reachable on GitHub for any pull requests, issues, or feedback related to my contributions.
About Me
- Education(Completed)
- University: Nagaland University
- Degree: Computer Science and Engineering
- How did you hear about this program?
- I heard about this program through YouTube.
- Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program?
- Fully available throughout Outreachy 2025 with no planned vacations or employment conflicts. My most productive hours are 8:00 AM to 5:00 PM IST on weekdays, with flexibility on weekends.
- We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?
- No, I have not applied for GSoC; I have only applied for Outreachy 2025 with the Wikimedia organization.
- What does making this project happen mean to you?
- I began contributing to the Dashboard in late 2023 with the goal of applying my theoretical knowledge of computer science to real-world problems. Until then, most of my experience was academic, so I wanted to challenge myself by working on a live, user-facing project. Since then, I've had the opportunity to interact with past and current Dashboard interns, and I've learned a great deal from them, especially from my mentor Sage Ross. One of the most rewarding moments has been hearing that some of my contributions, bug fixes or new features, have genuinely helped users of the Dashboard. At some point, solving bugs and tackling issues around the Dashboard started to feel like playing a video game; I found myself really enjoying it, and I still do. Working on this project over the summer would feel like my way of giving back to the Dashboard, to the community, and especially to Sage and the past interns who've helped me grow. It would also allow me to contribute to some of the major issues I've been wanting to work on, and help take the Dashboard one step further in its development.
Past Experience
I have gained a great deal of experience with the project's codebase over the past year or so of contributing to the Dashboard. All of my contributions can be found here. Below, I list some contributions I made recently.
- #6279(merged): Optimize Survey Recent Responses fetching to prevent N+1 queries
- #6287(open): Optimize data loading for list of Articles edited in a Campaign, shown in Articles Edited tab
- #6274(open): Optimize batching for updating outdated average views
- #6204(open): Migrate Admin Notes CRUD from Redux to RTK Query & Set Up React-Redux Integration Testing
Flow Charts:
I have prepared two flowcharts — one for the backend and one for the frontend — which outline core strategies used to identify and resolve performance bottlenecks in the application, as shown below.



