Every File is a Potential Information Source

Regardless of the operating system you use or the applications you run on it, if you want to retain the information you viewed or created, you will need to either print a hard copy or save it in an electronic file suited to the application used. A good deal can be determined about the data in a file if the file is properly named and you can associate the filename extension with its corresponding application.

Much has been written about file naming conventions and the importance of developing a guideline for their use. Such a standard forms the foundation for building and maintaining a corporate Document Management System (DMS). Some DMSs (such as Synergis Software’s Adept application) can use this structure to assist in naming files, to simplify search criteria when retrieving archived data, or both.
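To illustrate how much a well-formed filename can carry, here is a minimal Python sketch. The naming convention (project, discipline, sheet, revision), the pattern, and the extension-to-application table are all hypothetical and invented for this example; a real standard would define its own fields.

```python
import re
from pathlib import Path

# Hypothetical convention: <project>-<discipline>-<sheet>_<revision>.<ext>
# Example: "P1042-CIV-0007_B.dwg". The pattern is an illustration only.
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Z]\d{4})-(?P<discipline>[A-Z]{3})"
    r"-(?P<sheet>\d{4})_(?P<revision>[A-Z])$"
)

# Small, assumed mapping from extension to a typical parent application.
EXTENSION_APPS = {
    ".dwg": "AutoCAD",
    ".dxf": "CAD exchange (DXF)",
    ".pdf": "PDF viewer",
}


def describe_file(filename):
    """Return the fields encoded in a filename, or None if it breaks the convention."""
    path = Path(filename)
    match = NAME_PATTERN.match(path.stem)
    if match is None:
        return None
    info = match.groupdict()
    # The extension points at the parent application, when we recognize it.
    info["application"] = EXTENSION_APPS.get(path.suffix.lower(), "unknown")
    return info
```

Calling `describe_file("P1042-CIV-0007_B.dwg")` returns the project, discipline, sheet, revision, and likely parent application without opening the file, while a name that ignores the convention simply returns `None`.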

While the ability to glean basic information about a file (and its parent application) from the filename has some benefit, it doesn’t provide detailed information about the file’s contents. If a specific piece of data is needed (e.g., an account number, an asset description, or engineering data), the usual routine is to launch the parent application, load the file, and retrieve the needed information. This approach works as a quick means of acquiring a small piece of information, but it is not an efficient use of time and resources when a large body of embedded data needs to be integrated from one or more files into another application.

As a CAD Manager, I am often required to share data between projects as well as between differing applications (e.g., CAD, GIS, EAM) to ensure that the integrity of a client’s data is maintained and sustainable for future use. The ability to see beyond CAD data and envision how it could be utilized elsewhere was first introduced to me by a colleague (I’ll call him Dave, well… because that’s his name). Dave was the first to introduce me to the concept that an electronic file, and a CAD file in particular, is nothing more than a simple, stand-alone database. And the benefit of any database is that, with the proper tools, its data can be Extracted, Transformed, and then Loaded (ETL) into any other database or file without the use of the parent application. Once this is understood, a host of possibilities opens up for how a simple CAD file can be used to help maintain GIS data, asset information, accounting information, and so on, and vice versa.
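The three ETL steps can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not any vendor’s method: the CSV text stands in for attribute data already pulled out of a drawing, and its columns, values, date format (assumed MM/DD/YYYY), and the `assets` table are all invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical export of CAD block attributes; columns and values are invented.
CAD_ATTRIBUTES_CSV = """tag,description,install_date
hyd-001 ,Fire Hydrant,03/15/2009
vlv-014 ,Gate Valve,11/02/2012
"""


def extract(text):
    """Extract: read the raw records out of the source format."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize IDs and convert MM/DD/YYYY dates to ISO format."""
    cleaned = []
    for row in rows:
        month, day, year = row["install_date"].strip().split("/")
        cleaned.append((
            row["tag"].strip().upper(),
            row["description"].strip(),
            f"{year}-{month}-{day}",
        ))
    return cleaned


def load(records, connection):
    """Load: write the cleaned records into the target database."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS assets "
        "(asset_id TEXT PRIMARY KEY, description TEXT, installed TEXT)"
    )
    connection.executemany("INSERT INTO assets VALUES (?, ?, ?)", records)
    connection.commit()


# An in-memory database keeps the sketch self-contained.
connection = sqlite3.connect(":memory:")
load(transform(extract(CAD_ATTRIBUTES_CSV)), connection)
rows = connection.execute(
    "SELECT asset_id, installed FROM assets ORDER BY asset_id"
).fetchall()
```

Note that the parent application never appears: once the data is out of the source file, the target (here an asset table) neither knows nor cares where it came from.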

The fact that an AutoCAD DWG file is a basic database is not fresh news. The DXF format shows that a DWG file’s schema is structured to hold the graphical vectors of a drawing as well as its annotation, attributes, and other metadata. However, the intent of the DXF format is to allow the exchange of CAD data across differing CAD applications and versions. As such, the parent applications are still required.
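The database-like nature of a drawing is easy to see in an ASCII DXF file, which is a sequence of two-line pairs: a numeric group code followed by its value. The sketch below reads entities from a hand-written, heavily simplified fragment (real DXF files contain more sections and many more codes). It relies only on a few well-known group codes: 0 for the entity type, 8 for the layer, 10/20 and 11/21 for coordinates, and 1 for a text value. The function names are made up for this example.

```python
# Hand-written, simplified DXF fragment: one LINE and one TEXT entity.
SAMPLE_DXF = """0
SECTION
2
ENTITIES
0
LINE
8
ROADS
10
0.0
20
0.0
11
25.0
21
10.0
0
TEXT
8
LABELS
10
5.0
20
5.0
1
Main Street
0
ENDSEC
0
EOF
"""


def read_pairs(text):
    """Yield (group_code, value) pairs from ASCII DXF text."""
    lines = text.splitlines()
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i].strip()), lines[i + 1].strip()


def read_entities(text):
    """Collect each entity in the ENTITIES section as a dict keyed by group code."""
    entities, current, section = [], None, None
    for code, value in read_pairs(text):
        if code == 0:
            # A code-0 pair always closes the entity in progress.
            if current is not None:
                entities.append(current)
                current = None
            if value == "ENDSEC":
                section = None
            elif value == "EOF":
                break
            elif section == "ENTITIES":
                current = {"type": value}
        elif code == 2 and section is None and current is None:
            # The first code-2 pair after SECTION names the section.
            section = value
        elif current is not None:
            current[code] = value
    return entities


entities = read_entities(SAMPLE_DXF)
```

Each resulting dictionary behaves like a row in a table: `entities[0][8]` is the layer of the line, and `entities[1][1]` is the label text, all read without launching a CAD application.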

As applications continue to evolve, many are incorporating ETL technology to allow for integration with other applications of similar purpose. For instance, AutoCAD Map 3D and Civil 3D can directly access a GIS database, and GIS applications can import an AutoCAD drawing. But don’t limit yourself to the industry or team that you are associated with. Keep in mind that every file is a database for its host application, and as such may contain data useful for your project. I would encourage you to consider raster image files (e.g., TIFF, JPEG, GIF), database files (e.g., Oracle, SQL), PDFs, e-mail, and even web data (to name a few) as potential sources of information that can be integrated into your current project.

To understand this concept and learn how data can be Extracted from a given file, Transformed into a usable format for another application, and Loaded into differing file formats, a good place to start is Safe Software’s FME application. (Note: if you look closely at your CAD or GIS licensing agreements, you may find that this technology is already embedded in the application.)

The intent of this brief article is not to advertise any one vendor, but to help you better understand the data that you are charged with maintaining and how it can be integrated with other data to enhance your daily tasks.

With all the data that is at your fingertips today, don’t limit yourself just because your tools-of-the-trade have limits.