MiniExcel/README.zh-CN.md

23 KiB
Raw Blame History

NuGet Build status .NET Framework .NET Standard .NET


English / 简体中文 / 繁體中文


QQ : 813100564 (MiniExcel讨论群) / 459050029 (dotnet china)


简介

MiniExcel简单、高效避免OOM的.NET处理Excel查、写、填充数据工具。

目前主流框架大多需要将数据全载入到内存方便操作但这会导致内存消耗问题MiniExcel 尝试以 Stream 角度写底层算法逻辑能让原本1000多MB占用降低到几MB避免内存不够情况。

image

特点

  • 低内存耗用避免OOM(out of memoery)、频繁 Full GC 情况
  • 支持即时操作每行数据 miniexcel_lazy_load
  • 兼具搭配 LINQ 延迟查询特性,能办到低消耗、快速分页等复杂查询
  • 轻量不依赖任何套件DLL小于100KB
  • 简便操作的 API 风格

快速开始

安装

请查看 from NuGet

更新日志

请查看 Release Notes

TODO

请查看 TODO

性能测试

Test1,000,000x10.xlsx 做基准与主流框架做性能测试,总共 1千万笔 "HelloWorld",文件大小 23 MB

Benchmarks 逻辑可以在 MiniExcel.Benchmarks 查看或是提交 PR运行指令

dotnet run -p .\benchmarks\MiniExcel.Benchmarks\ -c Release -f netcoreapp3.1 -- -f * --join

最后一次运行结果 :

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
  Job-ZYYABG : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
IterationCount=3  LaunchCount=3  WarmupCount=3  
Method 最大内存耗用 平均时间 Gen 0 Gen 1 Gen 2
'MiniExcel QueryFirst' 0.109 MB 726.4 us - - -
'ExcelDataReader QueryFirst' 15.24 MB 10,664,238.2 us 566000.0000 1000.0000 -
'MiniExcel Query' 17.3 MB 14,179,334.8 us 367000.0000 96000.0000 7000.0000
'ExcelDataReader Query' 17.3 MB 22,565,088.7 us 1210000.0000 2000.0000 -
'Epplus QueryFirst' 1,452 MB 18,198,015.4 us 535000.0000 132000.0000 9000.0000
'Epplus Query' 1,451 MB 23,647,471.1 us 1451000.0000 133000.0000 9000.0000
'OpenXmlSDK Query' 1,412 MB 52,003,270.1 us 978000.0000 353000.0000 11000.0000
'OpenXmlSDK QueryFirst' 1,413 MB 52,348,659.1 us 978000.0000 353000.0000 11000.0000
'ClosedXml QueryFirst' 2,158 MB 66,188,979.6 us 2156000.0000 575000.0000 9000.0000
'ClosedXml Query' 2,184 MB 191,434,126.6 us 2165000.0000 577000.0000 10000.0000
Method 最大内存耗用 平均时间 Gen 0 Gen 1 Gen 2
'MiniExcel Create Xlsx' 15 MB 11,531,819.8 us 1020000.0000 - -
'Epplus Create Xlsx' 1,204 MB 22,509,717.7 us 1370000.0000 60000.0000 30000.0000
'OpenXmlSdk Create Xlsx' 2,621 MB 42,473,998.9 us 1370000.0000 460000.0000 50000.0000
'ClosedXml Create Xlsx' 7,141 MB 140,939,928.6 us 5520000.0000 1500000.0000 80000.0000

读 Excel

1. Query 查询 Excel 返回强型别 IEnumerable 数据 [Try it]

public class UserAccount
{
    public Guid ID { get; set; }
    public string Name { get; set; }
    public DateTime BoD { get; set; }
    public int Age { get; set; }
    public bool VIP { get; set; }
    public decimal Points { get; set; }
}

var rows = MiniExcel.Query<UserAccount>(path);

// or

using (var stream = File.OpenRead(path))
    var rows = stream.Query<UserAccount>();

image

2. Query 查询 Excel 返回Dynamic IEnumerable 数据 [Try it]

  • Key 系统预设为 A,B,C,D...Z
MiniExcel 1
Github 2

var rows = MiniExcel.Query(path).ToList();

// or 
using (var stream = File.OpenRead(path))
{
    var rows = stream.Query().ToList();
                
    Assert.Equal("MiniExcel", rows[0].A);
    Assert.Equal(1, rows[0].B);
    Assert.Equal("Github", rows[1].A);
    Assert.Equal(2, rows[1].B);
}

3. 查询数据以第一行数据当Key [Try it]

注意 : 同名以右边数据为准

Input Excel :

Column1 Column2
MiniExcel 1
Github 2

var rows = MiniExcel.Query(useHeaderRow:true).ToList();

// or

using (var stream = File.OpenRead(path))
{
    var rows = stream.Query(useHeaderRow:true).ToList();

    Assert.Equal("MiniExcel", rows[0].Column1);
    Assert.Equal(1, rows[0].Column2);
    Assert.Equal("Github", rows[1].Column1);
    Assert.Equal(2, rows[1].Column2);
}

4. Query 查询支援延迟加载(Deferred Execution)能配合LINQ First/Take/Skip办到低消耗、高效率复杂查询

举例 : 查询第一笔数据

var row = MiniExcel.Query(path).First();
Assert.Equal("HelloWorld", row.A);

// or

using (var stream = File.OpenRead(path))
{
    var row = stream.Query().First();
    Assert.Equal("HelloWorld", row.A);
}

与其他框架效率比较 :

queryfirst

5. 查询指定 Sheet 名称

MiniExcel.Query(path, sheetName: "SheetName");
//or
stream.Query(sheetName: "SheetName");

6. 查询所有 Sheet 名称跟数据

var sheetNames = MiniExcel.GetSheetNames(path).ToList();
foreach (var sheetName in sheetNames)
{
    var rows = MiniExcel.Query(path, sheetName: sheetName);
}

7. 查询所有栏(列)

var columns = MiniExcel.GetColumns(path); // e.g result : ["A","B"...]

var cnt = columns.Count;  // get column count

8. Dynamic Query 转成 IDictionary<string,object> 数据

foreach(IDictionary<string,object> row in MiniExcel.Query(path))
{
    //..
}

写 Excel

  1. 必须是非abstract 类别有公开无参数构造函数
  2. MiniExcel SaveAs 支援 IEnumerable参数延迟查询,除非必要请不要使用 ToList 等方法读取全部数据到内存

图片 : 是否呼叫 ToList 的内存差别

image1. Anonymous or strongly type [Try it]

var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
MiniExcel.SaveAs(path, new[] {
    new { Column1 = "MiniExcel", Column2 = 1 },
    new { Column1 = "Github", Column2 = 2}
});

2. Datatable:

  • 优先使用 Caption 当栏位名称
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
var table = new DataTable();
{
    table.Columns.Add("Column1", typeof(string));
    table.Columns.Add("Column2", typeof(decimal));
    table.Rows.Add("MiniExcel", 1);
    table.Rows.Add("Github", 2);
}

MiniExcel.SaveAs(path, table);

3. Dapper

using (var connection = GetConnection(connectionString))
{
    var rows = connection.Query(@"select 'MiniExcel' as Column1,1 as Column2 union all select 'Github',2");
    MiniExcel.SaveAs(path, rows);
}

4. IEnumerable<IDictionary<string, object>>

var values = new List<Dictionary<string, object>>()
{
    new Dictionary<string,object>{{ "Column1", "MiniExcel" }, { "Column2", 1 } },
    new Dictionary<string,object>{{ "Column1", "Github" }, { "Column2", 2 } }
};
MiniExcel.SaveAs(path, values);

output :

Column1 Column2
MiniExcel 1
Github 2

5. SaveAs 支援 Stream [Try it]

using (var stream = File.Create(path))
{
    stream.SaveAs(values);
}

模板填充 Excel

  • 宣告方式类似 Vue 模板 {{变量名称}}, 或是集合渲染 {{集合名称.栏位名称}}
  • 集合渲染支持 IEnumerable/DataTable/DapperRow

1. 基本填充

模板:
image

最终效果:
image

代码:

// 1. By POCO
var value = new
{
    Name = "Jack",
    CreateDate = new DateTime(2021, 01, 01),
    VIP = true,
    Points = 123
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);


// 2. By Dictionary
var value = new Dictionary<string, object>()
{
    ["Name"] = "Jack",
    ["CreateDate"] = new DateTime(2021, 01, 01),
    ["VIP"] = true,
    ["Points"] = 123
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

2. IEnumerable 数据填充

Note1: 同行从左往右以第一个 IEnumerableUse 当列表来源 (不支持同列多集合)

模板:
image

最终效果:
image

代码:

//1. By POCO
var value = new
{
    employees = new[] {
        new {name="Jack",department="HR"},
        new {name="Lisa",department="HR"},
        new {name="John",department="HR"},
        new {name="Mike",department="IT"},
        new {name="Neo",department="IT"},
        new {name="Loan",department="IT"}
    }
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

//2. By Dictionary
var value = new Dictionary<string, object>()
{
    ["employees"] = new[] {
        new {name="Jack",department="HR"},
        new {name="Lisa",department="HR"},
        new {name="John",department="HR"},
        new {name="Mike",department="IT"},
        new {name="Neo",department="IT"},
        new {name="Loan",department="IT"}
    }
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

3. 复杂数据填充

Note: 支持多 sheet 填充,并共用同一组参数

模板:

image

最终效果:

image

代码:

// 1. By POCO
var value = new
{
    title = "FooCompany",
    managers = new[] {
        new {name="Jack",department="HR"},
        new {name="Loan",department="IT"}
    },
    employees = new[] {
        new {name="Wade",department="HR"},
        new {name="Felix",department="HR"},
        new {name="Eric",department="IT"},
        new {name="Keaton",department="IT"}
    }
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

// 2. By Dictionary
var value = new Dictionary<string, object>()
{
    ["title"] = "FooCompany",
    ["managers"] = new[] {
        new {name="Jack",department="HR"},
        new {name="Loan",department="IT"}
    },
    ["employees"] = new[] {
        new {name="Wade",department="HR"},
        new {name="Felix",department="HR"},
        new {name="Eric",department="IT"},
        new {name="Keaton",department="IT"}
    }
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

4. 大数据填充效率比较

NOTE: 在 MiniExcel 使用 IEnumerable 延迟 ( 不ToList ) 可以节省内存使用

image

5. Cell 值自动类别对应

模板

image

最终效果

image

类别

public class Poco
{
    public string @string { get; set; }
    public int? @int { get; set; }
    public decimal? @decimal { get; set; }
    public double? @double { get; set; }
    public DateTime? datetime { get; set; }
    public bool? @bool { get; set; }
    public Guid? Guid { get; set; }
}

代码

var poco = new TestIEnumerableTypePoco { @string = "string", @int = 123, @decimal = decimal.Parse("123.45"), @double = (double)123.33, @datetime = new DateTime(2021, 4, 1), @bool = true, @Guid = Guid.NewGuid() };
var value = new
{
    Ts = new[] {
        poco,
        new TestIEnumerableTypePoco{},
        null,
        poco
    }
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

6. Example : 列出 Github 专案

模板

image

最终效果

image

代码

var projects = new[]
{
    new {Name = "MiniExcel",Link="https://github.com/shps951023/MiniExcel",Star=146, CreateTime=new DateTime(2021,03,01)},
    new {Name = "HtmlTableHelper",Link="https://github.com/shps951023/HtmlTableHelper",Star=16, CreateTime=new DateTime(2020,02,01)},
    new {Name = "PocoClassGenerator",Link="https://github.com/shps951023/PocoClassGenerator",Star=16, CreateTime=new DateTime(2019,03,17)}
};
var value = new
{
    User = "ITWeiHan",
    Projects = projects,
    TotalStar = projects.Sum(s => s.Star)
};
MiniExcel.SaveAsByTemplate(path, templatePath, value);

Excel Column Name/Index/Ignore Attribute

e.g

input excel :

image

public class ExcelAttributeDemo
{
    [ExcelColumnName("Column1")]
    public string Test1 { get; set; }
    [ExcelColumnName("Column2")]
    public string Test2 { get; set; }
    [ExcelIgnore]
    public string Test3 { get; set; }
    [ExcelColumnIndex("I")] // system will convert "I" to 8 index
    public string Test4 { get; set; } 
    public string Test5 { get; } //wihout set will ignore
    public string Test6 { get; private set; } //un-public set will ignore
    [ExcelColumnIndex(3)] // start with 0
    public string Test7 { get; set; }
}

var rows = MiniExcel.Query<ExcelAttributeDemo>(path).ToList();
Assert.Equal("Column1", rows[0].Test1);
Assert.Equal("Column2", rows[0].Test2);
Assert.Null(rows[0].Test3);
Assert.Equal("Test7", rows[0].Test4);
Assert.Null(rows[0].Test5);
Assert.Null(rows[0].Test6);
Assert.Equal("Test4", rows[0].Test7);

Excel 类别自动判断

MiniExcel 预设会根据扩展名或是 Stream 类别判断是 xlsx 还是 csv但会有失准时候请自行指定。

stream.SaveAs(excelType:ExcelType.CSV);
//or
stream.SaveAs(excelType:ExcelType.XLSX);
//or
stream.Query(excelType:ExcelType.CSV);
//or
stream.Query(excelType:ExcelType.XLSX);

范例

1. SQLite & Dapper 读取大数据新增到数据库

note : 请不要呼叫 call ToList/ToArray 等方法,这会将所有数据读到内存内

using (var connection = new SQLiteConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    using (var stream = File.OpenRead(path))
    {
       var rows = stream.Query();
       foreach (var row in rows)
             connection.Execute("insert into T (A,B) values (@A,@B)", new { row.A, row.B }, transaction: transaction);
       transaction.Commit();
    }
}

效能: image

2. ASP.NET Core 3.1 or MVC 5 下载 Excel Xlsx API Demo Try it

public class HomeController : Controller
{
    public IActionResult Index()
    {
        return new ContentResult
        {
            ContentType = "text/html",
            StatusCode = (int)HttpStatusCode.OK,
            Content = @"<html><body>
<a href='Home/DownloadExcel'>DownloadExcel</a><br>
<a href='Home/DownloadExcelFromTemplatePath'>DownloadExcelFromTemplatePath</a><br>
<a href='Home/DownloadExcelFromTemplateBytes'>DownloadExcelFromTemplateBytes</a>
</body></html>"
        };
    }

    public IActionResult DownloadExcel()
    {
        var values = new[] {
            new { Column1 = "MiniExcel", Column2 = 1 },
            new { Column1 = "Github", Column2 = 2}
        };
        var memoryStream = new MemoryStream();
        memoryStream.SaveAs(values);
        memoryStream.Seek(0, SeekOrigin.Begin);
        return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
        {
            FileDownloadName = "demo.xlsx"
        };
    }

    public IActionResult DownloadExcelFromTemplatePath()
    {
        string templatePath = "TestTemplateComplex.xlsx";

        Dictionary<string, object> value = new Dictionary<string, object>()
        {
            ["title"] = "FooCompany",
            ["managers"] = new[] {
                new {name="Jack",department="HR"},
                new {name="Loan",department="IT"}
            },
            ["employees"] = new[] {
                new {name="Wade",department="HR"},
                new {name="Felix",department="HR"},
                new {name="Eric",department="IT"},
                new {name="Keaton",department="IT"}
            }
        };

        MemoryStream memoryStream = new MemoryStream();
        memoryStream.SaveAsByTemplate(templatePath, value);
        memoryStream.Seek(0, SeekOrigin.Begin);
        return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
        {
            FileDownloadName = "demo.xlsx"
        };
    }

    private static Dictionary<string, Byte[]> TemplateBytesCache = new Dictionary<string, byte[]>();

    static HomeController()
    {
        string templatePath = "TestTemplateComplex.xlsx";
        byte[] bytes = System.IO.File.ReadAllBytes(templatePath);
        TemplateBytesCache.Add(templatePath, bytes);
    }

    public IActionResult DownloadExcelFromTemplateBytes()
    {
        byte[] bytes = TemplateBytesCache["TestTemplateComplex.xlsx"];

        Dictionary<string, object> value = new Dictionary<string, object>()
        {
            ["title"] = "FooCompany",
            ["managers"] = new[] {
                new {name="Jack",department="HR"},
                new {name="Loan",department="IT"}
            },
            ["employees"] = new[] {
                new {name="Wade",department="HR"},
                new {name="Felix",department="HR"},
                new {name="Eric",department="IT"},
                new {name="Keaton",department="IT"}
            }
        };

        MemoryStream memoryStream = new MemoryStream();
        memoryStream.SaveAsByTemplate(bytes, value);
        memoryStream.Seek(0, SeekOrigin.Begin);
        return new FileStreamResult(memoryStream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
        {
            FileDownloadName = "demo.xlsx"
        };
    }
}

3. 分页查询

void Main()
{
	var rows = MiniExcel.Query(path);
	
	Console.WriteLine("==== No.1 Page ====");
	Console.WriteLine(Page(rows,pageSize:3,page:1));
	Console.WriteLine("==== No.50 Page ====");
	Console.WriteLine(Page(rows,pageSize:3,page:50));
	Console.WriteLine("==== No.5000 Page ====");
	Console.WriteLine(Page(rows,pageSize:3,page:5000));
}

public static IEnumerable<T> Page<T>(IEnumerable<T> en, int pageSize, int page)
{
	return en.Skip(page * pageSize).Take(pageSize);
}

20210419

FQA

Q: 如何将查询结果转为 DataTable

提醒 : 不建议使用因为DataTable会将数据全载入内存失去MiniExcel低内存消耗功能。

public static DataTable QueryAsDataTable(string path)
{
	var rows = MiniExcel.Query(path, true);
	var dt = new DataTable();
	var first = true;
	foreach (IDictionary<string, object> row in rows)
	{
		if (first)
		{
			foreach (var key in row.Keys)
			{
				var type = row[key]?.GetType() ?? typeof(string);
				dt.Columns.Add(key, type);
			}

			first = false;
		}
		dt.Rows.Add(row.Values.ToArray());
	}
	return dt;
}

image

Q: Excel 表头标题名称跟 class 属性名称不一致,如何对应?

A. 请使用 ExcelColumnName 作 mapping

image

局限与警告

  • 目前不支援 xls (97-2003) 或是加密文件。

参考

Contributors