e-Discovery Core Glossary: D (part 1)


D

Data Any information stored on a computer. All software is divided into two general categories: data and programs. Programs are collections of instructions for manipulating data. In database management systems, data files are the files that store the database information. Other files, such as index files and data dictionaries, store administrative information, known as metadata.

Data Categorization The categorization and sorting of ESI (electronically stored information), such as foldering by "concept," content, subject, taxonomy, etc., through the use of technology such as search and retrieval software or artificial intelligence, to facilitate review and analysis.

Data Collection See Harvesting.

Data Extraction The process of retrieving data from documents (hard copy or digital).

Data Formats The organization of information for display, storage or printing. Data is sometimes maintained in certain common formats, e.g., PDF or HTML, so that it can be used by various programs, which may only work with data in a particular format.

Data Mining Data mining generally refers to knowledge discovery in databases (structured data), often through techniques for extracting summaries and reports from databases and data sets. In the context of electronic discovery, this term often refers to the processes used to cull through a collection of ESI to extract evidence for production or presentation in an investigation or in litigation.

Data Verification Assessment of data to ensure it has not been modified. The most common method of verification is hash coding: a hash value is computed with an algorithm such as MD5 and compared with a value recorded earlier.
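As an illustration only, the following minimal Python sketch shows hash-based verification using MD5 from the standard `hashlib` module. The function names `md5_of` and `is_unmodified` and the sample content are hypothetical; the sketch assumes a hash was recorded when the data was collected.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def is_unmodified(data: bytes, recorded_hash: str) -> bool:
    """Compare the current hash of the data with a previously recorded hash."""
    return md5_of(data) == recorded_hash.lower()


# Hypothetical sample content; in practice this would be a file's bytes.
original = b"Quarterly report, final version."
recorded = md5_of(original)  # hash recorded at collection time

print(is_unmodified(original, recorded))                 # True
print(is_unmodified(original + b" (edited)", recorded))  # False
```

Any change to the content, however small, is expected to produce a different hash value, which is what makes the comparison useful as a check.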

DBMS (Database Management System) A software system used to access and retrieve data stored in a database.

Database In electronic records, a database is a set of data elements consisting of at least one file, or of a group of integrated files, usually stored in one location and made available to several users. Databases are sometimes classified according to their organizational approach, with the most prevalent approach being the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.

Decompression To expand or restore compressed data back to its original size and format.
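A short sketch of a compression and decompression round trip, using Python's standard `zlib` module as one example of a compression method. The sample data is made up for illustration.

```python
import zlib

# Hypothetical, highly repetitive sample data (compresses well).
original = b"ESI " * 1000

compressed = zlib.compress(original)     # smaller representation
restored = zlib.decompress(compressed)   # expanded back to the original

print(len(original), len(compressed), len(restored))
print(restored == original)  # True: decompression restores the data exactly
```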

Decryption Transformation of encrypted data back to original form.

DeDuplication ("deduping") The process of comparing electronic records based on their characteristics and removing or marking duplicate records within the data set. The definition of "duplicate records" should be agreed upon, i.e., whether an exact copy from a different location (such as a different mailbox, server tapes, etc.) is considered to be a duplicate. Deduplication can be selective, depending on the agreed-upon criteria.
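The sketch below illustrates one possible deduplication approach under a stated assumption: two records count as duplicates when their content is byte-for-byte identical, regardless of location. That criterion, the function name `deduplicate`, and the sample records are all hypothetical; an actual matter would use whatever definition the parties agree upon.

```python
import hashlib


def deduplicate(records):
    """Split records into unique items and marked duplicates.

    records: list of (location, content_bytes) pairs.
    Returns (unique_locations, duplicates), where each duplicate is a
    (duplicate_location, first_seen_location) pair.
    """
    first_seen = {}  # content hash -> location where it first appeared
    unique, duplicates = [], []
    for location, content in records:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates.append((location, first_seen[digest]))
        else:
            first_seen[digest] = location
            unique.append(location)
    return unique, duplicates


# Hypothetical records: the same message stored in two mailboxes.
records = [
    ("mailbox_a/msg1", b"Please review the attached contract."),
    ("mailbox_b/msg7", b"Please review the attached contract."),
    ("server/msg2", b"Meeting moved to Thursday."),
]

unique, duplicates = deduplicate(records)
print(unique)      # ['mailbox_a/msg1', 'server/msg2']
print(duplicates)  # [('mailbox_b/msg7', 'mailbox_a/msg1')]
```

Here duplicates are marked rather than removed, so the decision about what to do with them stays with the reviewer.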

DeFragment ("defrag") Use of a computer utility to reorganize files so they are more contiguous on a hard drive or other storage medium, if the files or parts thereof have become fragmented and scattered in various locations within the storage medium in the course of normal computer operations. Used to optimize the operation of the computer, it will overwrite information in unallocated space.

Deletion The process whereby data is removed from active files and other data storage structures on computers and rendered inaccessible except through the use of special data recovery tools designed to recover deleted data. Deletion occurs on several levels in modern computer systems: (a) File level deletion renders the file inaccessible to the operating system and normal application programs and marks the storage space occupied by the file's directory entry and contents as free and available to reuse for data storage; (b) Record level deletion occurs when a record is rendered inaccessible to a database management system (DBMS) (usually marking the record storage space as available for reuse by the DBMS, although in some cases the space is never reused until the database is compacted) and is also characteristic of many email systems; (c) Byte level deletion occurs when text or other information is deleted from the file content (such as the deletion of text from a word processing file); such deletion may render the deleted data inaccessible to the application intended to be used in processing the file, but may not actually remove the data from the file's content until a process such as compaction or rewriting of the file causes the deleted data to be overwritten.

Deshading Eliminating shaded areas to render images more easily recognizable by OCR. Deshading software typically searches for areas with a regular pattern of tiny dots.

Despeckling Eliminating isolated speckles from an image file; speckles often develop when a document is scanned or faxed.
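A toy sketch of the idea, assuming a black-and-white image represented as a grid of 0 (white) and 1 (black) pixels. It clears any black pixel that has no black pixel among its eight neighbors. The function name `despeckle` and the sample page are hypothetical, and real despeckling software typically uses more refined filters.

```python
def despeckle(image):
    """Return a copy of a binary image with isolated black pixels cleared."""
    rows, cols = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            # Look at the (up to) eight surrounding pixels.
            has_black_neighbor = any(
                image[nr][nc] == 1
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if (nr, nc) != (r, c)
            )
            if not has_black_neighbor:
                cleaned[r][c] = 0  # isolated speckle: clear it
    return cleaned


# Hypothetical page: one stray speckle at row 1, column 1,
# and a small connected stroke in the lower right.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]

for row in despeckle(page):
    print(row)
```

The speckle is removed while the connected stroke is left untouched, since each of its pixels has at least one black neighbor.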

Digital Information stored as a string of ones and zeros (numeric).

Digital Fingerprint A fixed-length hash code that, for practical purposes, uniquely represents the binary content of a file.

Digital Signature A way to verify the identity of the sender, utilizing public key cryptography and working in conjunction with certificates.

Digitize The process of converting an analog value into a digital representation.
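As a rough worked example, the sketch below maps an analog value (here a hypothetical voltage between 0 and 5 volts) onto an 8-bit digital code by simple uniform quantization. The function name `digitize`, the voltage range, and the bit depth are assumptions chosen for illustration.

```python
def digitize(value, v_min, v_max, bits):
    """Map an analog value in [v_min, v_max] to an integer code of the given bit depth."""
    levels = 2 ** bits
    clamped = min(max(value, v_min), v_max)   # keep the value inside the range
    step = (v_max - v_min) / levels           # width of one quantization level
    code = int((clamped - v_min) / step)
    return min(code, levels - 1)              # the top of the range maps to the last level


code = digitize(2.5, 0.0, 5.0, 8)
print(code)                 # 128
print(format(code, "08b"))  # 10000000
```

The second line of output shows the same value written as a string of ones and zeros, which is how the digital representation is stored.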

Directory A simulated file folder or container used to organize files and directories in a hierarchical or treelike structure. UNIX and DOS use the term "directory," while Mac and Windows use the term "folder."

Dirty Text OCR output reflecting text as read by the OCR engine(s), with no cleanup.

Discovery The process of identifying, locating, securing, and producing all materials for the purpose of obtaining evidence for utilization in the legal process. Also used to describe the process of reviewing all materials that may be relevant to the issues at hand (“Responsive Documents”) and/or that may need to be disclosed to other parties, and of evaluating evidence to prove or disprove facts, theories or allegations.


