Dynamically generating and sending html emails using ETL(Informatica)
Sending status emails from ETL is a very common practice in data warehouse projects. email tasks are available in all ETL tools which makes this task much easier. Normally, content of these emails are dynamic and created using Unix scripts or in some case ETL itself. Common ETL generated summary emails include
- Error reports
- ETA and job/load status
- Data warehouse/ Mart load completion
- Database/Server capacity alerts
- Summary reports in email
Most of these emails are send to business group or IT project support itself. These emails are formatted and the real data is send as attachments(.csv,.xls ,.txt etc).
Here, I am demonstrating a method to compose and send html emails using ETL and Unix command. I am deviating from usual method of sending emails with attachment and instead writing the attachment content into email body itself. Advanatge of html email is that , the data can be visually represented in better way. For e.g. success status can be highlighted in green and alerts or failures can be marked and highlighted.
The code is done in Informatica Powercenter with a simple mapping using an expression transformation. With a very little understanding of html, data read from source is enclosed with html tags. html tags used here are very basic (table, bold , bgcolor etc). Visit w3schools to learn more about basic html.
In this example , data is read from classic employee file and a summary html email is sent. Employees who has experience more 5 than years is highlighted in green. More meaningful summary can be sent based on your requirements. Sample file used as source shown below.
Informatica mapping has a source, source qualifier, expression transformation and target. Target is a flat file with column defined as text with enough precision to write entire email subject and body.
Mapping expression transformation is used to dynamically generate the body of the email. Expression transformation has the capability to create variable ports just like variables in any other programming languages. Variable ports are created here to
- Write a static header (explained down)
- Write a static footer
- Write dynamic content
- A variable to set background columns conditionally (here years of experience greater than 5 highlighted green)
- A counter variable to get the first and last row of the source. When counter variable is set as 1 header information written to target flat file and when counter is last then footer information is written. For any other counter value data from source file is written.
Header port (v_header) is used to write static information like email subject, sender and receiver information, changing email format to html not plain text. To display data in a table format, table tags are used. Along with other header information , table header is written to flat file. Table headers here is employee id, employee name and years of experience. Header variable port code goes like
'Subject: Sample html email'||chr(010) ||'FROM: Tech_Support_Team'||chr(010)||'To: Business_Team'||chr(010)||'Content-Type: text/html; charset=us-ascii'||chr(010)||'<!doctype html public "-//w3c//dtd html 4.0 transitional//en">'||chr(010)||'<html>'||chr(010)|| '<table border="1">' || '<tr bgcolor="#9CD8FC">'||'<td><b>Employee ID</b></td><td><b>Employee Name</b></td><td><b>Years of Experience</b></td></tr>'||chr(010)
Dynamic data(v_data) read from source file is wrapped around with html tags table row (tr) and columns (td). The file data i.e. employee id , name and years of experience are read from file and written in tabular format
'<tr><td>'|| EMP_ID|| '</td><td <td >'|| EMP_NAME || '</td><td bgcolor="'|| v_color ||'" > '|| YEARS_OF_EXP||'</td></tr>'||chr(010)
Footer port(v_Footer) is used to write closing html tags
A counter variable port v_counter is used to increment, this is used to identify first and last row from the file.
A variable port called v_color is used to conditionally set color. If the years of experience is greater than 5 the variable is set as green. This variable is used in v_data port to set background color.
Output port(Out_data) is used to write the data into target flat file. Output port code looks like below. This will write header concatenated with the data when the first row is read and write data concatenated with footer when last rows reads out of source. In this example, we are expecting only 4 rows. So footer data is written when 4th row comes.
This is how the output generated file looks like.
The email is send using Unix command. sendmail or mailx command can be used to send emails. command is invoked from post session command tasks in Informatica session manager
/usr/lib/sendmail 'firstname.lastname@example.org'< ff_line.out
This is the final summary email send across.