FlexCel Reports Developer Guide
Introduction
This document is a part of a 2-part description on how to create Excel files by “reporting” instead of with code. In this part we will look at how to set up the coded needed to create the files, and in the next part FlexCel Reports Designer Guide we will look at how to modify the Excel file used as a template to create the reports.
About Excel Reporting
FlexCel gives you two ways to create an Excel file:
- With the API using the class XlsFile
- With a templating system using the class FlexCelReport .
Each method has its good and bad things, and it is good to know the advantages and drawbacks of each.
Creating a report using XlsFile is a low-level approach. As with most lower level approaches, it will be very fast to run (if coded right!), and you will have access to the entire API, so you can do whatever you can do with FlexCelReport and more. After all, FlexCelReport uses XlsFile internally to work its magic.
Whatever you can do in FlexCelReport, you can do it with the API.
But you need to take a lot of care when coding directly in the API: XlsFile reports can be a nightmare to maintain if not coded correctly. When you reach enough number of lines like:
xls.SetCellValue(3, 4, "Title");
var fmt = xls.GetFormat(xls.GetCellFormat(3, 4));
fmt.Font.Name = "Helvetica";
fmt.Font.Size20 = 14 * 20;
int XF = xls.AddFormat(fmt);
xls.SetCellFormat(3, 4, XF);
changing the report can become quite difficult. Imagine the user wants to insert a column with the expenses (and don't ask why, users always want to insert a column).
Now you should change the line:
xls.SetCellFormat(3, 4, XF);
to
xls.SetCellFormat(3, 5, XF);
But wait! You need to change also all references to column 5 to 6, from 6 to 7... If the report is complex enough, (for example you have a master-detail) this will be no fun at all.
But there is something much worse with using XlsFile directly. And this is that only the author can change the report. If your user wants to change the logo to a new one, or maybe make column C a little wider, he needs to call you and you need to recompile the application and send him a new executable.
FlexCelReport is a higher level approach. The design is cleanly separated on three different layers, data layer, interface layer and presentation layer, and most of the work is done on the presentation layer, with Excel. You design the report visually on Excel, and you mess as little as possible with the data layer. If the user wants to change the logo, he can just open the template and change the logo. If he wants to insert a column, he can just open the template and insert a column. And the application does not even need to be recompiled.
As with any higher level approach, it is slower than using the API directly, and there are some things where it is more limited. But all in all, reports are really fast too and the things you cannot do are in many cases not worth doing anyway.
So, the option is yours.
Organization of a FlexCel Report
A FlexCel report can be seen as three different modules working together to create the Excel file. Different from a “Coded” report where there is not a clear separation between data access and presentation, here each part has its own place, and can be developed separately by different people with different skills.
Data Layer
This is the layer that contains the data model of the information we want to send out. The data might be stored at a database, or in lists of objects in memory.
Interface Layer
This layer works as the glue between the data and the presentation layers. It has to prepare the data in a way that the presentation layer can easily consume.
Presentation Layer
This is the most complex and changing layer of the three. Here is where you design all the visual aspects of the report, like data position, fonts, colors, etc.
The big advantage of this “layering” is that they are somehow independent, so you can work on them at the same time. Also, Data and Interface layers are small and do not change much on the lifetime of the application. Presentation does change a lot, but it is done completely in Excel, so there is no need to recompile the application each time a cell color or a position changes.
Note
The data flow goes from the Presentation layer to the Data layer and back. It is the presentation that asks FlexCel for the data it needs, and FlexCel that in turn asks for the data. It is not the application that tells FlexCel the data it needs (As in SetCellValue(xx) ), but FlexCel that will ask for the data to the application when it needs it.
On this document we are going to speak about Data and Interface layers. The Presentation layer is discussed on a different document, FlexCel Reports Designer Guide because it is complex enough to deserve it, and because it might be changed by someone who only understands Excel, not .NET. And we don't want to force him to read this document too.
Data Layer
The objective of the data Layer is to have all the data ready for the report when it needs it. Currently, there are six ways to provide the data:
Via Report Variables. Report variables are added to FlexCel by using FlexCelReport.SetValue”. You can define as many as you want, and they are useful for passing constant data to the report, like the date, etc. On the presentation side you will write <#ReportVarName> each time you want to access a report variable.
Via User defined functions: User defined functions are classes you define on your code that allow for specific needs on the presentation layer that it can't handle alone. For example, you might define a user function named “NumberToString(int number) that will return “One” when number=1, “Two” when number =2 and so on. On the presentation layer you would write <#NumberToString(2)> to obtain the string “Two”
Via DataSets: .NET DataSets are a collection of memory tables that can be sorted, filtered or manipulated. Whatever format your data is, you can use it to fill a DataTable and then use that data on FlexCel.
Via IEnumerable: If you have business objects that implement IEnumerable (or even better IQueryable) and you are in .NET 3.5 or up, you can use them directly with FlexCel. Just call FlexCelReport.AddTable(“name”, YourCollection) and you are ready to go.
Via Direct SQL in the template. For maximum flexibility, you can set up the data layer on the template too. In this case, you need to add a database connection to FlexCel, and all the rest is done on the template. This allows the users to completely edit the reports, even the data layer from the template, but also it allows users to do things they might not be intended to. Use it with care.
Via Virtual Datasets: If your business objects don’t implement IQueryable, and the overhead of copying your existing objects to datasets is too much, you can implement your own wrapper objects to provide this data to FlexCel without copying it. But please take this option with care, as it might be simpler and more performing to just dump your objects into a DataSet and use that. FlexCel has an optimized implementation of a virtual dataset for DataSets, and if you create your own virtual datasets you will have to implement a similar level of performance (and functionality). Please read the Appendix II for a more detailed description of virtual datasets.
Linq based DataSources
With FlexCel you can use any object that implements IEnumerable<T> as a datasource for a report (in .NET 3.5 or newer). Internally, FlexCel will use Linq to query the objects and read the results.
To use Linq, you just need a collection of objects. For every object in the collection, you can use all the public properties as fields in the report. So if for example you have the following class:
class MyObject
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
And the list of MyObject:
var MyObjectList = new List<MyObject>();
You can add this collection as a datasource for the report with the line: FlexCelReport.AddTable("MyName", Objs)
And then in the template write <#MyName.FirstName> and < #MyName.LastName>inside a “__MyName__” range to export the collection.
You can also use Entity Framework collections or any other of collection that implements IEnumerable as datasource.
Note
Record count: When FlexCel runs a report from an IEnumerable datasource, it needs to know the count of objects in advance, so it can insert the rows before filling the data. For normal collections like List<Object> this isn’t a problem since they provide a “Count” method that will be used by Linq and it is very fast.
For Entity Framework collections, two different SQLs will be sent to the server, once to get the record count and the second to actually fetch the records. Other collections could have no way to know the record count in advance, and in those cases Linq will loop over all objects in the collection to get the count. In this third case, it will be faster if you provide the record count yourself. Look at the Performance guide to see how you can supply the record count.
Note
Transactions: When you are running a report that access the database (like a report from Entity Framework objects) you must be sure that the database isn’t updated while the report is running.
As said in the note above, in database reports FlexCel sends two SQLs to the server for every table, the first to get the count and the second to get the data. If when you run the second SQL the number of records changed, you will see weird things in the report as the number of rows inserted won’t be the same as the number of records written to the report. Also for master-detail and even a single table, you must make sure that data doesn’t change while it is being fetched. To ensure this, the report should be run inside a “Snapshot” or “Serializable” transaction.
Snapshot transactions are preferred if supported because they don’t block other users from modifying the tables while the report runs. Please take a look at the Performance guide and the Entity Framework demo for more information about transactions.
Dataset based DataSources
Besides Linq, FlexCel also allows using .NET DataSets as datasources. DataSets are less flexible than LINQ datasources and can use more memory, but on the other hand, they can be faster than Linq in many cases. Also, you might have already your data in DataSets, and in this case nothing will be faster than a direct report from it.
The reason why it uses more memory is that it has to fill the dataset before running the report, loading all the data into memory. The reason it can be faster many times is that it loads all the data just once from the database, and then works from memory, avoiding further access to the database which can be slow.
To use a DataSet in a report, just add the dataset or the tables with:
FlexCelReport.AddTable(DataSet);
Or any of the similar AddTable overloads.
As with LINQ, if there is a chance that the data can be modified while you are filling the datasets, those datasets must be filled inside a “Snapshot” or “Serializable” transaction. An advantage of datasets here is that those transactions will complete faster. Once the data is loaded in the dataset, you can run the report without worries about other users changing the data, as it is all loaded in memory.
Data Relationships
FlexCel supports Master-Detail reports, where you have a “master” datasource and a “detail” datasource where for every master record you have different detail records. You could have an “employee” table and for every employee, the orders they sold.
For these kinds of reports, you must tell FlexCel which is the master and which is the detail. Also, you must tell FlexCel how those two tables are related; for example, both tables could be related by an “EmployeeId” field that is used to know which orders correspond to which employee.
You can specify those relationships in two ways:
Implicit Relationships
When the individual objects inside a collection of objects contain a public property that is itself a collection of objects, the second collection is implicitly a detail of the first. For example, if you have the class:
class Employee
{
public string FirstName { get; set; }
public string LastName { get; set; }
public List<Order> Orders { get; set; }
}
Then “Orders” is an implicit detail of “Employee” and when you create a Master-Detail report, FlexCel will output the correct orders for every employee.
Implicit relationships are used in
Entity Framework, and are a simple
way to provide master-detail relationships if you have your objects
organized this way.
Explicit Relationships
Sometimes (as it happens with DataSets) you don’t have implicit relationships in your data. You might have two different Lists of objects:
List<Employee2> Employees;
List<Order2> Orders;
Where the objects are defined as:
class Employee2
{
public int EmployeeId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
class Order2
{
public int EmployeeId { get; set; }
public int OrderId { get; set; }
public string OrderName { get; set; }
}
And related by an Id (in this case EmployeeId). If you were to create a report out of those two lists, you would get the full list of orders for every Employee. You need to tell FlexCel that the “Employees” collection is related to the “Orders” collection by the EmployeeId.
You can add explicit relationships with
FlexCelReport.AddRelationship(...)
But you will almost never need to do so. FlexCel automatically loads all existing Data Relationships in datasets, so if the tables on your dataset are related, then you don’t need to do anything else.
For other objects different from datasets you will probably want to use implicit relationships. But there might be cases where you need explicit ones. Explicit relationships are more powerful than implicit, in the sense that you can have the same tables related in different ways. For example, The “Orders” collection could have also an explicit relationship with a “Date” table. If you run the report “Orders by Date” then that relationship will be used. If you run the report “Orders by Employee” then the employee relation will be used. If you had implicit relationships, then both “Date” and “Employee” collections would need an “Orders” field, and that could lead to repeated data.
Explicit data relationships can be defined with any objects, even between DataSets and and LINQ collections. Whenever FlexCel finds an explicit Data Relationship, it will filter the detail table to show the correct records for the master.
Tip
You might also define the relationships directly in the template instead of by code. This is useful when using Direct SQL, since the tables are defined directly in the Excel template and you can’t set up the relationships in code. Also, you can add custom relationships not related to datasets, when using virtual datasets.
Interface Layer
This is the simplest of the three, as all work is done internally. To set up the interface layer, you only need to tell FlexCel which datasets, tables and user functions it has available from the data layer. For this, you use “SetValue”, “SetUserFunction”, “AddTable” and “AddConnection” methods on FlexCelReport. Once you have told FlexCel what it can use, just call FlexCelReport.Run and this is all.
Note
It is worth spending some time thinking about what tables, functions and variables from the data layer you want to make available for the final user. Limiting it too much might mean cutting the power of the user to customize his report. And just adding everything might be too inefficient, since you are loading thousands of rows the user is not going to use, and might also have some security issues, as the user might get access to tables he is not expected to. You can also use SQL on the templates to allow maximum flexibility, but then you need to be extra careful about security.
Appendix: Virtual DataSets
This is an advanced topic, targeted to experienced developers with really specific needs.
As explained earlier, the easiest way to provide data to FlexCel is via IQueryable collections or DataSets. Both methods are fast, optimized, with lots of added functionality, like data relationships, filtering or sorting, and if you are experiencing issues with them, you are probably better using them.
But you might want to use your own objects directly, and they might not be stored in any of those containers.
You can do this by writing wrapper objects around your data, implementing the abstract classes **** and ****. In fact, standard IQueryable collections and Datasets also interoperate with FlexCel by implementing those classes.
Note
If you are curious and have FlexCel source code, you can search for TLinqDataTable, TLinqDataTableState, TAdoDotNetDataTable and TAdoDotNetDataTableState to see how they are implemented.
We have two classes to implement:
On one side we have . It is a “stateless” container, much like a dataset. Each virtual dataset corresponds with a table on the data layer that you would add with “FlexCelReport.AddTable”, or create by filtering existing datasets on the config sheet.
On the other side we have . Each VirtualDataTableState corresponds with a band on the presentation layer, and if 2 bands share the same data source they will have 2 different VirtualDataTableState objects associated (but a single shared VirtualDataTable).
This is probably easier to visualize with an example. Let's imagine we have the following report:
And this code:
FlexCelReport.AddTable(Customers);
FlexCelReport.AddTable(Company);
FlexCelReport.Run();
There are two VirtualDataTables here (Company and Customers), and three VirtualDataTableStates (One for the Company Band, one for the first Customers Band and one for the second Customers Band)
Warning
Take note that the same VirtualDataTable is used by two different VirtualDataTableStates, and might be used by other VirtualDataTableStates in other threads.
This is why you cannot save any “state” information on the VirtualDataTable, and if you write to any private variable inside of it (for example to keep a cache) you should use locks to avoid threading issues.
Always assume that some other class might be reading your data.
VirtualDataTableState on the other hand is a simple class that will not be accessed by more than one class at the time, and you can do whatever you want inside it without worries of other threads trying to access it.
Creating a VirtualDataTable descendant
The first step into creating our own data access is to create a VirtualDataTable descendant and override its methods. You do not need to implement every method to make it work, just the ones that provide the functionality you want to give to your end users.
VirtualDataTable Required Methods:
On every DataTable you define, you need to implement at least the following methods:
GetColumn, GetColumnCaption, GetColumnName and ColumnCount:
Those methods define the “columns” of your dataset, and the fields you can write on the <#dataset.field> tags on the template.
CreateState: It allows FlexCel to create VirtualDataTableState instances of this DataTable for each band. You will not create VirtualDataTableState instances directly on your user code, FlexCel will create them using this method.
VirtualDataTable Optional Methods:
Now, depending on the functionality you want to provide to the end user, you might want to implement the following methods:
FilterData: Will return a new VirtualDataTable with the filtered data.
You need to implement this method if you want to provide the user with the ability to create new datasets by filtering existing ones on the config sheet. If you do not implement it, any attempt to create a filtered dataset on the config sheet will raise an exception.
Also when FlexCel needs to create a dataset that is a copy of the existing one (for example for sorting it) it will call FilterData with rowFilter null. So even if you don’t implement filtering, it is normally a good idea to at least implement the case for “rowFilter” = null and return a clone of the datatable.
Note that this only applies to standard filters. For <#Distinct()> or <#Split> filters you do not need to implement this.
GetDistinct: Will return a new VirtualDataTable with only unique records.
Implement this method if you want to let your user write <#Distinct()> filters on the config sheet.
LookUp: Will look for a record on the dataset, and return the corresponding value. This method allows the user to use <#Lookup()> tags on their reports.
GetRelationWith: Use this method to return implicit relationships between your data tables. For example the VirtualDataTable implementation of Datasets uses this method to return the ADO.NET DataRelations between datasets. The Linq implementation returns as related any nested dataset.
Creating a VirtualDataTableState descendant:
Again, you do not need to implement every method in this class, and the methods you don't implement will just reduce functionality.
VirtualDataTableState Required Methods:
RowCount: Here you will tell FlexCel how many records this dataset has on its current state. Remember that this might not be the total record count. If for example you are on a master-detail relationship, and the master is on record 'A', you need to return the count of records that correspond with 'A'. Make sure this method is fast, since it is called a lot of times.
GetValue: Here you will finally tell FlexCel what is the value of the data at a given row and column. You can know the row by reading the “Position” property, and the column is a parameter to the method.
As with RowCount, this method should return the records on a specific state, not all the records on the datatable. If you are in a master-detail relationship and only two records of detail correspond the master position, GetValue(position = 0) should return the first record, and GetValue(Position = 1) should return the second. It doesn't matter if the total number of records is 5000.
VirtualDataTableState Optional Methods:
MoveMasterRecord: This method is called each time the master changes its position when you are on a master-detail relationship. You should use it to “filter” the data and cache the records that you will need to return on RowCount and GetValue. For example, when the master table moves to record “B”, you should find out here all the records that apply to “B”, and cache them somewhere where RowCount and GetValue can read them.
You should probably create indexes on the data on the constructor of this class, so MoveMasterRecord can be fast finding the information it needs.
You do not need to implement this method if the VirtualDataTable is not going to be used on Master-Detail relationships or Split relationships.
FilteredRowCount: This method returns the total count of records for the current state (similar to RowCount), but, without considering Split conditions.
If the dataset has 200 records, of which only 20 apply for the current master record, and you have a Split of 5 records, RowCount will return 5 and FilteredRowCount will return 20.
This method is used by the Master Split table to know how much records it should have. In the last example, with a FilteredRowCount of 20 and a split every 5 records, you need 4 master records. You do not need to implement this method if you do not want to provide “Split” functionality.
MoveFirst/MoveNext: Implement these methods if you want to do something in your data when FlexCel moves the active record. For example, the LINQ implementation uses those methods to move the IEnumerator. The dataset implementation does nothing as it doesn’t need to track the changes in position.
EOF: Return true if the dataset is at the last record. If you don’t implement this method, the default implementation is to check Position == RecordCount. But RecordCount might be expensive to calculate, and in this case, if you explicitly implement this method, it will work faster.
Finally
When you have defined both classes, you need to create instances of your VirtualDataTable, and add them to FlexCel with FlexCelReport.AddTable.
For examples of how the whole process is done, please take a look at the Virtual DataSet demo.