PPTGen - Data-driven powerpoints for automated narration
Submitted by Sanjay Yadav (@sanjay15) on Thursday, 31 August 2017
Microsoft PowerPoint (PPT) is a widely used enterprise tool for creating business presentations, designing visualizations and narrating data stories. Although it supports a long list of objects, not all data visualizations are supported natively in PPT. PPTGen is a platform-independent library that creates and modifies PowerPoint objects and builds PowerPoint presentations from customer datasets.
Web dashboards built on real-world data are a ubiquitous way to present use cases in business presentations at every level of an organization. Each dashboard view typically presents a group of visualizations that together tell a data story. Usually, screenshots of different dashboard views are captured by hand or by an automated script and then stitched together for an important meeting. This approach has several limitations: limited screenshot resolution, chart objects that cannot be edited, and manual effort. PPTGen automates this process.
Each PowerPoint (PPT) file is a group of XML objects that can be modified programmatically to suit a business need based on a specific dataset (a flat file or a remote/local database). To create a new data visualization or modify an existing PPT-supported chart template, we modify the native PPT object properties according to the data.
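To make the "group of XML objects" idea concrete, here is a minimal sketch using only the Python standard library. It is not PPTGen's implementation. It assumes a heavily simplified slide fragment (a real slide part carries many more elements, and this sample archive would not open in PowerPoint), and the helper names `set_shape_text`, `build_sample_package` and `rewrite_package` are hypothetical. It shows a named shape being located in the slide XML inside a zip archive and its text being replaced with a value from data.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by PresentationML (slides) and DrawingML (text runs).
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
ET.register_namespace("p", P_NS)
ET.register_namespace("a", A_NS)

# Simplified slide fragment for illustration only (assumption: one shape named "Title").
SLIDE_XML = (
    '<p:sld xmlns:p="{p}" xmlns:a="{a}"><p:cSld><p:spTree>'
    '<p:sp><p:nvSpPr><p:cNvPr id="2" name="Title"/></p:nvSpPr>'
    '<p:txBody><a:p><a:r><a:t>Placeholder</a:t></a:r></a:p></p:txBody>'
    '</p:sp></p:spTree></p:cSld></p:sld>'
).format(p=P_NS, a=A_NS)


def set_shape_text(slide_xml, shape_name, new_text):
    """Replace the first text run of the shape whose name matches shape_name."""
    root = ET.fromstring(slide_xml)
    for shape in root.iter("{%s}sp" % P_NS):
        props = shape.find("{%s}nvSpPr/{%s}cNvPr" % (P_NS, P_NS))
        if props is not None and props.get("name") == shape_name:
            run_text = next(shape.iter("{%s}t" % A_NS), None)
            if run_text is not None:
                run_text.text = new_text
    return ET.tostring(root, encoding="unicode")


def build_sample_package():
    """Create an in-memory zip holding the sample slide part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("ppt/slides/slide1.xml", SLIDE_XML)
    return buffer.getvalue()


def rewrite_package(package_bytes, part_name, shape_name, new_text):
    """Copy every part of the archive, rewriting the text of one named shape."""
    source = zipfile.ZipFile(io.BytesIO(package_bytes))
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == part_name:
                updated = set_shape_text(data.decode("utf-8"), shape_name, new_text)
                data = updated.encode("utf-8")
            target.writestr(item.filename, data)
    return output.getvalue()
```

Calling `rewrite_package(build_sample_package(), "ppt/slides/slide1.xml", "Title", "Revenue: 42")` returns a new archive in which the "Title" shape carries the data-driven text. In practice a library such as python-pptx hides this XML handling behind shape and chart objects.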
PPTGen is platform independent (Windows, Unix-like systems, Mac OS) and creates clean, data-driven PowerPoint files. It is built with Python, python-pptx and the template engine from Tornado. A YAML configuration file drives the process: data attributes are defined in it and tagged to specific identifiers of different PPT objects.
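The configuration-driven idea can be sketched as follows. This is an illustration under stated assumptions, not PPTGen's actual schema: the keys `data`, `rules`, `shape` and `text` are hypothetical, the configuration is written as the Python dict a YAML loader would return, and `str.format` stands in for the template engine. Each rule tags a data attribute to a shape identifier, and each data row yields the text for one slide.

```python
# Hypothetical configuration (assumption: not PPTGen's real schema),
# shown as the dict that a YAML loader would produce.
CONFIG = {
    "data": [
        {"region": "North", "sales": 120},
        {"region": "South", "sales": 95},
    ],
    "rules": [
        # Each rule maps a shape identifier to a text template over data attributes.
        {"shape": "Title", "text": "Sales for {region}"},
        {"shape": "Value", "text": "{sales} units"},
    ],
}


def render_rules(config, row):
    """Fill every rule's template with one data row, keyed by shape identifier."""
    return {rule["shape"]: rule["text"].format(**row) for rule in config["rules"]}


def render_deck(config):
    """Produce one shape-to-text mapping per data row (one per slide in this sketch)."""
    return [render_rules(config, row) for row in config["data"]]
```

Here `render_deck(CONFIG)` returns a list with one mapping per row. Keeping the mapping in configuration rather than code means the same template deck can be reused with a different dataset by editing only the configuration file.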
I work as a data scientist at Gramener and am part of the product/CTO team. Previously, my work on election sentiment analysis won me a silver medal at SRM University in 2014.