benchmarks/MiniExcel.Benchmarks | ||
docs | ||
drafts | ||
samples | ||
src | ||
tests | ||
.gitattributes | ||
.gitignore | ||
appveyor.yml | ||
LICENSE.md | ||
README.md | ||
README.zh-CN.md | ||
README.zh-Hant.md |
Introduction
MiniExcel is simple and efficient to avoid OOM's .NET processing Excel tool.
At present, most popular frameworks need to load all the data into the memory to facilitate operation, but it will cause memory consumption problems. MiniExcel tries to use algorithm from a stream to reduce the original 1000 MB occupation to a few MB to avoid OOM(out of memory).
Features
- Low memory consumption, avoid OOM (out of memory) and full GC
- Support
real-time
operation of each row of data - Support LINQ deferred execution, it can do low-consumption, fast paging and other complex queries
- Lightweight, does not with any third-party dependencies, DLL is less than 100KB
- Easy API style to read/write/fill excel
Get Started
- Excel Query
- Create Excel
- Fill Data To Excel Template
- Excel Column Name/Index/Ignore Attribute
- Examples
Demo
- LINQPad : Download Basic Demo.linq
Installation
You can install the package from NuGet
Release Notes
Please Check Release Notes
TODO
Please Check TODO
Performance
Test1,000,000x10.xlsx as performance test basic file,A total of 10,000,000 "HelloWorld" with a file size of 23 MB
Benchmarks logic can be found in MiniExcel.Benchmarks , and test cli
dotnet run -p .\benchmarks\MiniExcel.Benchmarks\ -c Release -f netcoreapp3.1 -- -f * --join
Output from the latest run is :
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
[Host] : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
Job-ZYYABG : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
IterationCount=3 LaunchCount=3 WarmupCount=3
Method | Max Memory Usage | Mean | Gen 0 | Gen 1 | Gen 2 |
---|---|---|---|---|---|
'MiniExcel QueryFirst' | 0.109 MB | 726.4 us | - | - | - |
'ExcelDataReader QueryFirst' | 15.24 MB | 10,664,238.2 us | 566000.0000 | 1000.0000 | - |
'MiniExcel Query' | 17.3 MB | 14,179,334.8 us | 367000.0000 | 96000.0000 | 7000.0000 |
'ExcelDataReader Query' | 17.3 MB | 22,565,088.7 us | 1210000.0000 | 2000.0000 | - |
'Epplus QueryFirst' | 1,452 MB | 18,198,015.4 us | 535000.0000 | 132000.0000 | 9000.0000 |
'Epplus Query' | 1,451 MB | 23,647,471.1 us | 1451000.0000 | 133000.0000 | 9000.0000 |
'OpenXmlSDK Query' | 1,412 MB | 52,003,270.1 us | 978000.0000 | 353000.0000 | 11000.0000 |
'OpenXmlSDK QueryFirst' | 1,413 MB | 52,348,659.1 us | 978000.0000 | 353000.0000 | 11000.0000 |
'ClosedXml QueryFirst' | 2,158 MB | 66,188,979.6 us | 2156000.0000 | 575000.0000 | 9000.0000 |
'ClosedXml Query' | 2,184 MB | 191,434,126.6 us | 2165000.0000 | 577000.0000 | 10000.0000 |
Method | Max Memory Usage | Mean | Gen 0 | Gen 1 | Gen 2 |
---|---|---|---|---|---|
'MiniExcel Create Xlsx' | 15 MB | 11,531,819.8 us | 1020000.0000 | - | - |
'Epplus Create Xlsx' | 1,204 MB | 22,509,717.7 us | 1370000.0000 | 60000.0000 | 30000.0000 |
'OpenXmlSdk Create Xlsx' | 2,621 MB | 42,473,998.9 us | 1370000.0000 | 460000.0000 | 50000.0000 |
'ClosedXml Create Xlsx' | 7,141 MB | 140,939,928.6 us | 5520000.0000 | 1500000.0000 | 80000.0000 |
Excel Query
1. Execute a query and map the results to a strongly typed IEnumerable [Try it]
Recommand to use Stream.Query because of better efficiency.
public class UserAccount
{
public Guid ID { get; set; }
public string Name { get; set; }
public DateTime BoD { get; set; }
public int Age { get; set; }
public bool VIP { get; set; }
public decimal Points { get; set; }
}
var rows = MiniExcel.Query<UserAccount>(path);
// or
using (var stream = File.OpenRead(path))
var rows = stream.Query<UserAccount>();
2. Execute a query and map it to a list of dynamic objects without using head [Try it]
- dynamic key is
A.B.C.D..
MiniExcel | 1 |
---|---|
Github | 2 |
var rows = MiniExcel.Query(path).ToList();
// or
using (var stream = File.OpenRead(path))
{
var rows = stream.Query().ToList();
Assert.Equal("MiniExcel", rows[0].A);
Assert.Equal(1, rows[0].B);
Assert.Equal("Github", rows[1].A);
Assert.Equal(2, rows[1].B);
}
3. Execute a query with first header row [Try it]
note : same column name use last right one
Input Excel :
Column1 | Column2 |
---|---|
MiniExcel | 1 |
Github | 2 |
var rows = MiniExcel.Query(useHeaderRow:true).ToList();
// or
using (var stream = File.OpenRead(path))
{
var rows = stream.Query(useHeaderRow:true).ToList();
Assert.Equal("MiniExcel", rows[0].Column1);
Assert.Equal(1, rows[0].Column2);
Assert.Equal("Github", rows[1].Column1);
Assert.Equal(2, rows[1].Column2);
}
4. Query Support LINQ Extension First/Take/Skip ...etc
Query First
var row = MiniExcel.Query(path).First();
Assert.Equal("HelloWorld", row.A);
// or
using (var stream = File.OpenRead(path))
{
var row = stream.Query().First();
Assert.Equal("HelloWorld", row.A);
}
Performance between MiniExcel/ExcelDataReader/ClosedXML/EPPlus
5. Query by sheet name
MiniExcel.Query(path, sheetName: "SheetName");
//or
stream.Query(sheetName: "SheetName");
6. Query all sheet name and rows
var sheetNames = MiniExcel.GetSheetNames(path).ToList();
foreach (var sheetName in sheetNames)
{
var rows = MiniExcel.Query(path, sheetName: sheetName);
}
7. Get Columns
var columns = MiniExcel.GetColumns(path); // e.g result : ["A","B"...]
var cnt = columns.Count; // get column count
8. Dynamic Query cast row to IDictionary<string,object>
foreach(IDictionary<string,object> row in MiniExcel.Query(path))
{
//..
}
Create Excel
-
Must be a non-abstract type with a public parameterless constructor .
-
MiniExcel support parameter IEnumerable Deferred Execution, If you want to use least memory, please do not call methods such as ToList
e.g : ToList or not memory usage
1. Anonymous or strongly type [Try it]
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
MiniExcel.SaveAs(path, new[] {
new { Column1 = "MiniExcel", Column2 = 1 },
new { Column1 = "Github", Column2 = 2}
});
2. Datatable
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
var table = new DataTable();
{
table.Columns.Add("Column1", typeof(string));
table.Columns.Add("Column2", typeof(decimal));
table.Rows.Add("MiniExcel", 1);
table.Rows.Add("Github", 2);
}
MiniExcel.SaveAs(path, table);
3. Dapper
using (var connection = GetConnection(connectionString))
{
var rows = connection.Query(@"select 'MiniExcel' as Column1,1 as Column2 union all select 'Github',2");
MiniExcel.SaveAs(path, rows);
}
4. IEnumerable<IDictionary<string, object>>
var values = new List<Dictionary<string, object>>()
{
new Dictionary<string,object>{{ "Column1", "MiniExcel" }, { "Column2", 1 } },
new Dictionary<string,object>{{ "Column1", "Github" }, { "Column2", 2 } }
};
MiniExcel.SaveAs(path, values);
Create File Result :
Column1 | Column2 |
---|---|
MiniExcel | 1 |
Github | 2 |
5. SaveAs Stream [Try it]
using (var stream = File.Create(path))
{
stream.SaveAs(values);
}
Fill Data To Excel Template
- The declaration is similar to Vue template
{{variable name}}
, or the collection rendering{{collection name.field name}}
- Collection rendering support IEnumerable/DataTable/DapperRow
1. Basic Fill
Code:
// 1. By POCO
var value = new
{
Name = "Jack",
CreateDate = new DateTime(2021, 01, 01),
VIP = true,
Points = 123
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
// 2. By Dictionary
var value = new Dictionary<string, object>()
{
["Name"] = "Jack",
["CreateDate"] = new DateTime(2021, 01, 01),
["VIP"] = true,
["Points"] = 123
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
2. IEnumerable Data Fill
Note1: Use the first IEnumerable of the same column as the basis for filling list
Code:
//1. By POCO
var value = new
{
employees = new[] {
new {name="Jack",department="HR"},
new {name="Lisa",department="HR"},
new {name="John",department="HR"},
new {name="Mike",department="IT"},
new {name="Neo",department="IT"},
new {name="Loan",department="IT"}
}
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
//2. By Dictionary
var value = new Dictionary<string, object>()
{
["employees"] = new[] {
new {name="Jack",department="HR"},
new {name="Lisa",department="HR"},
new {name="John",department="HR"},
new {name="Mike",department="IT"},
new {name="Neo",department="IT"},
new {name="Loan",department="IT"}
}
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
3. Complex Data Fill
Note: Support multi-sheets and using same varible
Template:
Result:
// 1. By POCO
var value = new
{
title = "FooCompany",
managers = new[] {
new {name="Jack",department="HR"},
new {name="Loan",department="IT"}
},
employees = new[] {
new {name="Wade",department="HR"},
new {name="Felix",department="HR"},
new {name="Eric",department="IT"},
new {name="Keaton",department="IT"}
}
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
// 2. By Dictionary
var value = new Dictionary<string, object>()
{
["title"] = "FooCompany",
["managers"] = new[] {
new {name="Jack",department="HR"},
new {name="Loan",department="IT"}
},
["employees"] = new[] {
new {name="Wade",department="HR"},
new {name="Felix",department="HR"},
new {name="Eric",department="IT"},
new {name="Keaton",department="IT"}
}
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);
4. Fill Big Data Performance
NOTE: Using IEnumerable deferred execution not ToList can save max memory usage in MiniExcel
Excel Column Name/Index/Ignore Attribute
e.g
input excel :
public class ExcelAttributeDemo
{
[ExcelColumnName("Column1")]
public string Test1 { get; set; }
[ExcelColumnName("Column2")]
public string Test2 { get; set; }
[ExcelIgnore]
public string Test3 { get; set; }
[ExcelColumnIndex("I")] // system will convert "I" to 8 index
public string Test4 { get; set; }
public string Test5 { get; } //wihout set will ignore
public string Test6 { get; private set; } //un-public set will ignore
[ExcelColumnIndex(3)] // start with 0
public string Test7 { get; set; }
}
var rows = MiniExcel.Query<ExcelAttributeDemo>(path).ToList();
Assert.Equal("Column1", rows[0].Test1);
Assert.Equal("Column2", rows[0].Test2);
Assert.Null(rows[0].Test3);
Assert.Equal("Test7", rows[0].Test4);
Assert.Null(rows[0].Test5);
Assert.Null(rows[0].Test6);
Assert.Equal("Test4", rows[0].Test7);
Excel Type Auto Check
Default system will auto check file path or stream is from xlsx or csv, but if you need to specify type, it can use excelType parameter.
stream.SaveAs(excelType:ExcelType.CSV);
//or
stream.SaveAs(excelType:ExcelType.XLSX);
//or
stream.Query(excelType:ExcelType.CSV);
//or
stream.Query(excelType:ExcelType.XLSX);
Examples:
1. SQLite & Dapper Large Size File
SQL Insert Avoid OOM
note : please don't call ToList/ToArray methods after Query, it'll load all data into memory
using (var connection = new SQLiteConnection(connectionString))
{
connection.Open();
using (var transaction = connection.BeginTransaction())
using (var stream = File.OpenRead(path))
{
var rows = stream.Query();
foreach (var row in rows)
connection.Execute("insert into T (A,B) values (@A,@B)", new { row.A, row.B }, transaction: transaction);
transaction.Commit();
}
}
2. ASP.NET Core 3.1 or MVC 5 Download Excel Xlsx API Demo Try it
public class ExcelController : Controller
{
public IActionResult DownloadExcel()
{
var values = new[] {
new { Column1 = "MiniExcel", Column2 = 1 },
new { Column1 = "Github", Column2 = 2}
};
var memoryStream = new MemoryStream();
memoryStream.SaveAs(values);
memoryStream.Seek(0, SeekOrigin.Begin);
return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
{
FileDownloadName = "demo.xlsx"
};
}
public IActionResult DownloadExcelFromTmplate()
{
var templatePath = "TestTemplateComplex.xlsx";
var value = new Dictionary<string, object>()
{
["title"] = "FooCompany",
["managers"] = new[] {
new {name="Jack",department="HR"},
new {name="Loan",department="IT"}
},
["employees"] = new[] {
new {name="Wade",department="HR"},
new {name="Felix",department="HR"},
new {name="Eric",department="IT"},
new {name="Keaton",department="IT"}
}
};
var memoryStream = new MemoryStream();
memoryStream.SaveAsByTemplate(templatePath, value);
memoryStream.Seek(0, SeekOrigin.Begin);
return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
{
FileDownloadName = "demo.xlsx"
};
}
public IActionResult DownloadExcelFromTmplate_StremVersion()
{
var templatePath = "TestTemplateComplex.xlsx";
var tytes = System.IO.File.ReadAllBytes(templatePath);
var value = new Dictionary<string, object>()
{
["title"] = "FooCompany",
["managers"] = new[] {
new {name="Jack",department="HR"},
new {name="Loan",department="IT"}
},
["employees"] = new[] {
new {name="Wade",department="HR"},
new {name="Felix",department="HR"},
new {name="Eric",department="IT"},
new {name="Keaton",department="IT"}
}
};
var memoryStream = new MemoryStream();
memoryStream.SaveAsByTemplate(tytes, value);
memoryStream.Seek(0, SeekOrigin.Begin);
return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
{
FileDownloadName = "demo.xlsx"
};
}
}
3. Paging Query
void Main()
{
var rows = MiniExcel.Query(path);
Console.WriteLine("==== No.1 Page ====");
Console.WriteLine(Page(rows,pageSize:3,page:1));
Console.WriteLine("==== No.50 Page ====");
Console.WriteLine(Page(rows,pageSize:3,page:50));
Console.WriteLine("==== No.5000 Page ====");
Console.WriteLine(Page(rows,pageSize:3,page:5000));
}
public static IEnumerable<T> Page<T>(IEnumerable<T> en, int pageSize, int page)
{
return en.Skip(page * pageSize).Take(pageSize);
}
FQA
Q: How to convert query results to DataTable
Reminder: Not recommended, because DataTable will load all data into memory and lose MiniExcel's low memory consumption function.
public static DataTable QueryAsDataTable(string path)
{
var rows = MiniExcel.Query(path, true);
var dt = new DataTable();
var first = true;
foreach (IDictionary<string, object> row in rows)
{
if (first)
{
foreach (var key in row.Keys)
{
var type = row[key]?.GetType() ?? typeof(string);
dt.Columns.Add(key, type);
}
first = false;
}
dt.Rows.Add(row.Values.ToArray());
}
return dt;
}
Limitations and caveats
- Not support xls and encrypted file now