Spreadsheets, predominantly Microsoft Excel files, are ubiquitous across businesses of all sizes and industries. They serve as critical tools for data analysis, financial reporting, decision-making, and much more. However, despite their importance and extensive use, few organizations accurately grasp the sheer number of spreadsheets within their environment. This knowledge gap often leads to significant challenges in data governance, compliance, and security.

In this article, we aim to shed light on how many spreadsheets typically exist in an enterprise, identify their common storage locations, discuss the reasons behind their proliferation, and outline strategies for effectively managing them.

How Many Spreadsheets Are There in an Enterprise?

The number of spreadsheets within an enterprise is often surprisingly large. For example, a 2021 Varonis report found that a medium-sized financial services company (500-1,500 employees) could have as many as 75 million files. A sizeable share of these are likely spreadsheets. Apparity, an EUC risk management software provider, suggests that a typical Fortune 500 company may manage millions of spreadsheets worldwide.

Further supporting these figures, Cimcon suggests that, on average, each employee maintains around 3,000 End-User Computing (EUC) files, with spreadsheets making up approximately 90% of these files within financial organizations. Similarly, Mitratech claims that spreadsheets represent between 70% and 80% of all EUC files in enterprises.
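To make these ratios concrete, the short Python sketch below turns them into a rough headcount-based estimate. It is only an illustration: the function name and the 1,000-employee firm are hypothetical, and combining Cimcon's per-employee figure with Mitratech's lower share is our own assumption rather than something either vendor states.

```python
def estimate_spreadsheets(employees: int,
                          euc_files_per_employee: int = 3000,
                          spreadsheet_share: float = 0.90) -> int:
    """Rough estimate: headcount x EUC files per employee x spreadsheet share.

    Defaults reuse the Cimcon figures quoted above (3,000 files, 90%).
    """
    return round(employees * euc_files_per_employee * spreadsheet_share)


# Hypothetical 1,000-employee firm, using the 90% and 70% shares cited above.
high = estimate_spreadsheets(1_000)
low = estimate_spreadsheets(1_000, spreadsheet_share=0.70)
print(f"Estimated spreadsheets: {low:,} to {high:,}")
```

An estimate like this is no substitute for an actual inventory, but it shows why even mid-sized organizations can plausibly hold spreadsheets in the millions.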

Additional research underscores this prevalence. Thorne and Hancock’s paper documented a scenario where a company of approximately 1,000 employees hosted about 228,704 spreadsheets on shared drives alone. A paper by Panko (2013) highlighted a global bank managing between 8 and 10 million spreadsheets, and a government agency maintaining around 630,000 spreadsheets. A 2021 Planview article cites a global professional services firm discovering more than a million spreadsheets, which translates to over 1,000 spreadsheets per employee. A case study from Cimcon reported that a top-ten North American bank identified 10 million EUC files during a routine inventory. A 2017 paper by Smith, Middleton, and Kraft refers to a single shared drive with over 300,000 spreadsheets at ABB.

Where Are These Spreadsheets Stored?

Enterprise spreadsheets reside across various IT environments:

  • Employee Devices: Individual desktops and laptops, particularly within local folders and personal document repositories, are common storage points. According to Egnyte, this practice has become even more widespread with the rise of remote work.
  • Shared Network Drives and File Servers: Traditionally, the bulk of Excel files in a company have been on on-premises shared drives (departmental file shares, Windows file servers, NAS devices, etc.). Many companies still have decades of spreadsheets in shared folders accessible to teams.
  • Cloud Storage and Collaboration Platforms: Digital transformation has accelerated the migration of spreadsheets to cloud platforms such as SharePoint/OneDrive, Google Drive, and Box. A survey by Egnyte highlighted that enterprises commonly use about 14 different content repositories (email, messaging apps, cloud drives, etc.) to store and share their files, exacerbating data fragmentation.
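
For locations that are reachable as a file system (a local folder, a mounted network share, or a synced cloud folder), a first inventory can be as simple as walking the directory tree and counting files by extension. The sketch below is a minimal illustration under our own assumptions: the function name and the list of extensions are not taken from any of the sources above, and a real inventory would also need to cover repositories that are not mounted as drives.

```python
from collections import Counter
from pathlib import Path

# Extensions treated as spreadsheets; an assumption, adjust to your environment.
SPREADSHEET_EXTENSIONS = {".xls", ".xlsx", ".xlsm", ".xlsb", ".ods"}


def count_spreadsheets(root: str) -> Counter:
    """Walk `root` recursively and count spreadsheet files by extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        # Skip the "~$..." owner files Excel creates while a workbook is open.
        if path.name.startswith("~$"):
            continue
        suffix = path.suffix.lower()
        if path.is_file() and suffix in SPREADSHEET_EXTENSIONS:
            counts[suffix] += 1
    return counts
```

Running `count_spreadsheets` against each share and summing the resulting counters gives a per-location breakdown that can be compared with the figures quoted earlier.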

Why Are There So Many Spreadsheets?

Spreadsheet proliferation primarily results from uncontrolled distribution practices, particularly through email or messaging platforms. Users frequently copy these files to personal or local directories, creating numerous divergent versions across multiple platforms. Examples from surveys and user anecdotes on platforms like Reddit and LinkedIn consistently illustrate this scenario:

A Reddit user commented, “I've seen CFOs and COOs folders filled with 20 different versions of the same Excel file.” Another Reddit user described a situation with “users emailing spreadsheets around and someone with a full time job collating them.” Yet another referred to “An Excel spreadsheet that gets emailed around and referenced in executive-level meetings for approval.”

A Sage article highlights a common scenario in budgeting processes, where spreadsheet templates are circulated via email among various departments. Each department head iteratively updates their budget data, resulting in numerous versions. This practice makes consolidation especially difficult, as finance personnel must sift through multiple versions to find the most current data.
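
The sifting described here can be partly automated. The sketch below groups files whose names differ only by a trailing version marker and reports the most recently modified file in each group. It rests on assumptions of our own: the suffix patterns ("_v2", "_final", " - Copy", " (1)", a trailing date) and the function names are hypothetical, and a file's modification time is only a heuristic for "most current", not proof of it.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical markers people append to copies; extend for your own naming habits.
VERSION_SUFFIX = re.compile(
    r"(?:[ _-]+(?:v\d+|final|copy|\d{4}-\d{2}-\d{2})|\s*\(\d+\))+$",
    re.IGNORECASE,
)


def base_name(path: Path) -> str:
    """Strip trailing version markers such as '_v2', ' (1)' or '_final'."""
    return VERSION_SUFFIX.sub("", path.stem).lower()


def latest_versions(root: str) -> dict[str, tuple[Path, int]]:
    """Map each base name to (most recently modified file, number of versions)."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*.xlsx"):
        groups[base_name(path)].append(path)
    return {
        name: (max(paths, key=lambda p: p.stat().st_mtime), len(paths))
        for name, paths in groups.items()
    }
```

Any base name with a version count above one is a candidate for consolidation. Name matching will miss renamed copies, so the output is a starting point for review rather than a definitive list.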

On LinkedIn, a user commented, “Emailing spreadsheets back and forth and merging them is a nightmare for large data volumes,” specifically referencing challenges faced in data collection for industrial plants.

A high-profile COVID-19 inquiry in the UK reported multiple individuals rapidly exchanging spreadsheets, each containing separate pieces of pandemic-related data. This required a frantic manual consolidation effort every day to produce briefings.

In their 2017 paper, Smith, Middleton, and Kraft refer to a 180-person survey conducted at ABB. Respondents indicated that sharing spreadsheet reports is common practice: despite the availability of tools like Qlik and Tableau, spreadsheets are often used as a reporting tool. Of the respondents, 75% share parts of or entire reporting spreadsheets, and 74% typically have two or more versions of a spreadsheet.

In a survey conducted by Hermans and Pinzger at a Dutch asset management company with 1,600 employees worldwide, 85% of participants indicated they shared spreadsheets with a colleague.

These references clearly demonstrate how sharing spreadsheets through emails and other platforms contributes significantly to their uncontrolled proliferation. Such practices inevitably result in confusion, data discrepancies, and errors. Simply examining Excel attachments within a typical business user's email inbox can reveal hundreds or even thousands of spreadsheets, further underscoring the widespread nature of this issue across organizations.
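
The inbox check mentioned above is easy to try on a mailbox exported to the mbox format. The following sketch uses Python's standard `mailbox` module to count Excel attachments by file name. The function name and extension list are our own, and it assumes an mbox export is available; mail held in other formats or on a mail server would need a different reader.

```python
import mailbox
from collections import Counter

EXCEL_EXTENSIONS = (".xls", ".xlsx", ".xlsm", ".xlsb")


def count_excel_attachments(mbox_path: str) -> Counter:
    """Count Excel attachments in an mbox file, keyed by lowercased file name."""
    counts = Counter()
    box = mailbox.mbox(mbox_path, create=False)  # fail if the file is missing
    try:
        for message in box:
            # walk() visits every MIME part, including nested attachments.
            for part in message.walk():
                filename = part.get_filename()
                if filename and filename.lower().endswith(EXCEL_EXTENSIONS):
                    counts[filename.lower()] += 1
    finally:
        box.close()
    return counts
```

The sum of the counter's values is the total number of Excel attachments, and file names with a count above one point to the same workbook being mailed around repeatedly.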

Strategies to Control Spreadsheet Growth

The proliferation of spreadsheets leads to numerous organizational issues, notably in areas such as security, data management, and compliance, as discussed extensively in my previous article, “Spreadsheet Sprawl – Uncontrolled Proliferation of Spreadsheets across the Organization: Challenges, Risks, and Solutions.”

Effectively addressing these risks involves significantly reducing or altogether eliminating practices that distribute multiple copies of spreadsheets. However, this must be achieved without compromising the flexibility and analytical power spreadsheets provide, which remain indispensable to business users.

The optimal solution is to let users continue leveraging spreadsheets for their intended purposes while converting their distribution into web-based applications. The underlying issue is not the spreadsheet itself, but rather the uncontrolled dissemination of spreadsheets, resulting in numerous duplicates and divergent versions. Transforming spreadsheets into web applications directly addresses this concern. Instead of sharing spreadsheet files via email, organizations can distribute links to these applications. Users can interact with these web-based applications in real time without being able to download or save local copies as spreadsheets.

This approach empowers solution owners to maintain a single, centralized spreadsheet, simplifying maintenance and ensuring consistent data across the user base while significantly reducing spreadsheet proliferation.