Big Data Journal: Article

Trends in Federal Records Management

Three Principles for Successful Federal Records Management

Below is a summary of the comments I provided on Wednesday, January 29, 2014, at the Alfresco Content.Gov event in Washington, DC.

In my 27 years of federal service, I've watched the growth in federal records and the implementation of new executive orders and regulations aimed at improving records management across the federal space. There are immense challenges associated with litigation, review and release, tracing factual evidence for analysis, managing information for legal proceedings, and overseeing a plethora of authorized and unauthorized disclosures of classified and/or sensitive information.

Federal records management professionals are true, unsung heroes in helping our nation protect information while also protecting the civil liberties and privacy of our nation's citizens. The job has become increasingly difficult in today's era of "big data." Records management and information management in the 1980s were hard, and that's when we thought big data was hundreds of gigabytes. As we consider today's generation of data, roughly three decades later, federal records professionals are charged with managing tens of thousands of gigabytes, and on up to petabytes and zettabytes, of data. It's an especially daunting task.

Three principles for records management are critical to future success for the federal space:

  1. Capture on creation;
  2. Manage and secure through the workflow; and
  3. Archive responsibly.

Point 1: Capture on Creation
The federal workforce creates content every second of every day. The content is created in formal and informal ways. It's an email, a meeting maker, an instant message communication, a voice communication, a VTC session, a PowerPoint deck, meeting minutes, a collaborative engagement session, a memorandum, a written paper, analytic notes, and so forth.

The federal workforce stores this created content in just as many formal and informal ways.  It's stored on local hard drives, mobile phones, corporate storage, shadow IT storage, public clouds, and private clouds.

In short...it's a mess for the records management professional.

What is needed are solid systems and capabilities that demand capture on content creation. Simple and non-intrusive ways to drive the creator to label information will help tremendously. Non-intrusive doesn't mean voluntary; actions for content creation need to be forced and demanded. Not everything is a record, but many things deserve to be preserved for after action review, lessons learned, and knowledge management training over time.
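As a rough illustration of what "non-intrusive but not voluntary" labeling could look like, the sketch below refuses to accept content until the creator supplies a minimal set of labels. Everything in it is hypothetical: the `ContentItem` type, the `capture` function, and the category list are assumptions made for the example, not features of any agency system or product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib

# Hypothetical label set; a real records schedule would define its own categories.
ALLOWED_CATEGORIES = {"record", "working_paper", "reference", "transitory"}


@dataclass(frozen=True)
class ContentItem:
    """A piece of content together with the labels captured when it was created."""
    creator: str
    category: str
    title: str
    body: bytes
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def digest(self) -> str:
        # SHA-256 of the body, so later copies can be compared with the original.
        return hashlib.sha256(self.body).hexdigest()


def capture(creator: str, category: str, title: str, body: bytes) -> ContentItem:
    """Accept content only if the required labels are present and valid."""
    if not creator.strip():
        raise ValueError("creator is required")
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"category must be one of {sorted(ALLOWED_CATEGORIES)}")
    if not title.strip():
        raise ValueError("title is required")
    return ContentItem(creator=creator, category=category, title=title, body=body)


# Usage: a labeled item is accepted; an unlabeled one is rejected at creation.
minutes = capture("jdoe", "record", "Weekly staff meeting minutes", b"Action items...")
```

The design choice here is to ask for only three labels, which keeps the burden on the creator small while still making the capture step mandatory.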

Many of today's technologies make it far too easy to create content and far too difficult to manage it in perpetuity.  Content creation with longevity in mind is critical for the federal records management professional and for the federal government in general.

Implementing technologies that work together to achieve the longevity goal is paramount. No federal agency can survive on one tool; one tool rarely meets the variety of end user needs or requirements. Discovering and implementing technologies with easy interfaces, open APIs, and purposeful data exchange bases will be most successful in the federal government. Often this equates to open source tools, which are naturally built for easy expansion and integration with other tools.

Point 2:  Manage and Secure Through the Workflow
Very little happens in the federal government without being attached to a workflow.

  • Employee time is a workflow that leads to paychecks.
  • Purchasing small and large goods is a workflow that leads to vendor payments and receipt of goods.
  • Asset management is a workflow from asset need to asset receipt to asset long-term disposition.
  • Analytic products are a workflow from inception to review to edit to publish.
  • Meetings are a workflow from establishment to agenda to minutes to action capture and tracking.
  • Federal budget creation is an uber-workflow from planning to programming to budgeting to execution.
  • Grants management is a workflow from idea submission to review to approval to tracking progress.
  • Citizen services contain many workflows for social security payments, passport processing, visa approvals, small business loans, and so forth.

Introducing solid records management to these macro and micro workflow environments is necessary and important.

The federal government needs tools that understand the intricate workflow processes and seamlessly capture the changes, approvals, and actions for the workflow throughout the entire process, from creation to retirement. A suite of tools, built on open platforms for easy data exchange, is likely to be required for any federal agency. Working through big ERP systems and through small purpose-built systems, workflow foundations can capture information necessary for approvals and for long-term retention.

Equally necessary are workflow tools that maintain data integrity, individual privacy, and agency security. The Federal Government demands absolute security in processing workflows, especially for citizen-based services that span public and private information processing environments. It's simply not enough to have workflow tools that are fundamentally secure in a private environment. Federal agencies need confidence when exchanging data from a mobile, citizen platform to a private, agency platform.
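One way to picture capturing workflow actions while preserving their integrity is an append-only log in which each entry carries a hash of the entry before it, so that a later alteration becomes detectable. The following is a minimal sketch under stated assumptions: the `WorkflowLog` class, its field names, and the grant example are invented for illustration and do not describe any particular federal system or commercial tool.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: Dict[str, str]) -> str:
    # Serialize deterministically so the same payload always hashes the same way.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


class WorkflowLog:
    """Append-only, hash-chained log of workflow actions (hypothetical design)."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    def append(self, item_id: str, actor: str, action: str) -> str:
        """Record one workflow step and return its hash."""
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        payload = {"item_id": item_id, "actor": actor, "action": action}
        entry_hash = _entry_hash(prev_hash, payload)
        self._entries.append(
            {"payload": payload, "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if _entry_hash(prev_hash, entry["payload"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


# Usage: a grant moves from submission to approval, and the trail is checked.
log = WorkflowLog()
log.append("grant-001", "applicant", "submitted")
log.append("grant-001", "reviewer", "approved")
trail_is_intact = log.verify()
```

A chain like this only shows that the trail has not been altered after the fact; privacy and access control would still need to be handled separately.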

Point 3:  Archive Responsibly
Fundamental to our form of government is trust.  Trust of our people is fundamental.  Trust by our federal workforce is fundamental. Trust in our records and information is equally fundamental. When the Administration or the Hill or the People want to know what we knew and when we knew it, federal agencies need to be at the ready to provide the truth - with facts and records to support the facts.

The Federal Government and its agencies aren't private institutions. Although there is information that we should not keep, federal agencies should continue to err on the side of caution and keep anything that seems worth keeping. We should be prepared to keep more information and more records than legally required to lend credibility and understanding of historical decisions and outcomes.

Again, we need tools and technologies that make responsible records management and archival easier for everyone. The amount of resources spent by the federal government on review and redaction of federal records is staggering. If we could have technologies to cut the resources just by 10 percent, that would be awesome. Reaching 20 or 30 percent cost reductions would be phenomenal.

Key to reducing manpower in archival, review, and release is solid creation at the start. At the risk of creating a circular reference, I'll take you back to my initial point of Capture on Creation.

Summary

  • Federal agencies create more data and content than any of us cares to understand.
  • It's not all useful data, and finding our way through the mountains of data to know and keep what's important is a tough job.
  • Securing the data to prevent harmful use and unlawful disclosure needs to be easier for federal agencies.
  • Knowing when a leak is harmful also needs to be easier for federal agencies.
  • Responding to appropriate releases of information, whether through Freedom of Information Act requests or congressional inquiries, shouldn't be as hard as it is today.
  • Guaranteeing the safety and security of private citizen data isn't a desire...it's a demand.
  • The basic needs for federal agencies are:
    • Suites of tools that do a large amount of the content management;
    • Open interfaces and open source tools that allow affordable and extensible add-ons for special purposes;
    • Tools that facilitate reduced complexity for end users and IT departments; and
    • Tools that make a records management professional's and an end user's jobs easier on a day-to-day basis.

More Stories By Jill Tummler Singer

Jill Tummler Singer is CIO for the National Reconnaissance Office (NRO), which, as part of the 16-member Intelligence Community, plays a primary role in achieving information superiority for the U.S. Government and Armed Forces. A DoD agency, the NRO is staffed by DoD and CIA personnel. It is funded through the National Reconnaissance Program, part of the National Foreign Intelligence Program.

Prior to joining the NRO, Singer was Deputy CIO at the Central Intelligence Agency (CIA), where she was responsible for ensuring CIA had the information, technology, and infrastructure necessary to effectively execute its missions. Prior to her appointment as Deputy CIO, she served as the Director of the Diplomatic Telecommunications Service (DTS), United States Department of State, and was responsible for global network services to US foreign missions.

Singer has served in several senior leadership positions within the Federal Government. She was the head of Systems Engineering, Architecture, and Planning for CIA's global infrastructure organization. She served as the Director of Architecture and Implementation for the Intelligence Community CIO and pioneered the technology and management concepts that are the basis for multi-agency secure collaboration. She also served within CIA’s Directorate of Science and Technology.
