Linux Forensic Artifacts: File System, Logs, Cron and Shell History
ext4, /var/log, auditd, journalctl, bash history, cron persistence, SUID, SELinux and plaso timelining for Indian digital forensic examiners working Linux servers.
Linux forensic artifacts are distributed across the file-system layer (ext4 inodes, journal blocks), structured log databases (/var/log/auth.log, /var/log/btmp, systemd-journald), the shell-history file (~/.bash_history), cron drop locations (/etc/cron.d/, /var/spool/cron/crontabs/), and kernel-audit records from auditd. Each layer survives differently: file-system journals cycle in minutes on busy servers, binary wtmp and btmp persist until rotated, and shell history is the first artifact an attacker attempts to wipe. India's CERT-In directions of 28 April 2022 mandate 180-day retention of ICT system logs, which means auditd, journald, and authentication logs are legally expected to exist on Indian-hosted servers for the relevant investigation window.
The interesting Indian digital forensic cases stop being Windows the moment they touch a server. Web hosting providers, payment gateways, government departments, and most of the cloud workloads CERT-In sees in its incident pipeline are Linux. A forensic examiner who can read an ext4 inode, parse auth.log, pull a deleted bash history from journal binary logs, and spot a cron entry that adds a user every Tuesday at 03:00 IST is the examiner the case actually needs. The Windows artifact playbook does not translate. Linux puts its trail in plain-text files (mostly), in binary journal databases (increasingly), and in places like /proc and /sys that exist only while the system is running.
Key takeaways
- Linux servers in Indian production are overwhelmingly ext4, with XFS dominant on RHEL-family RAID arrays and Btrfs visible on some SUSE and Synology NAS evidence.
- Linux puts its forensic trail in plain-text files mostly, in binary journal databases increasingly, and in places like /proc and /sys that exist only while the system is running.
- The CERT-In mandatory log retention regime under the 28 April 2022 directions sets the retention baseline that an examiner must reference when auditing a Linux server.
- A cron entry that adds a user on a recurring schedule is a concrete persistence indicator an examiner should look for during Linux triage.
- Deleted bash history can sometimes be recovered from journal binary logs, making the binary journal a secondary source even when the plain-text history file is wiped.
This topic walks the file-system-to-userspace stack a Linux forensic examiner reads, in the order they typically read it. The framing throughout is Indian, with CERT-In's mandatory log retention regime under the 28 April 2022 directions, the IT Act 2000 read with BSA 2023 Section 63 for admissibility, and the working assumption that the target system is a Debian, Ubuntu, RHEL or CentOS server. macOS sits next to Linux in the Unix family, on the BSD-derived branch; the macOS plist, Keychain and Time Machine topic is the natural follow-up. For the persistence-and-removal half of the same stack, see data recovery and file carving.
By the end of this topic you will be able to:
- Identify the correct file-system artifact locations (inode, directory entry, journal) for a given Linux file-system type and explain what survives deletion.
- Enumerate the /var/log artifact families by distribution family and describe what each records, including the distinction between wtmp, utmp, and btmp.
- Recover or approximate timestamps for bash history entries when HISTTIMEFORMAT is unset, using auth.log session brackets and systemd journal _COMM entries.
- List every cron persistence location on a Linux server (system crontab, /etc/cron.d/, interval directories, per-user spools, anacron, systemd timers) and explain why per-user crontabs are a recurring blind spot.
- Apply auditd rule queries (ausearch, aureport) and plaso super-timelining to correlate artifacts from multiple layers into a single event sequence.
- Inode: The ext2/3/4 metadata structure that holds a file's permissions, ownership, MAC times, link count and block pointers. The file name lives in the parent directory entry, not in the inode.
- wtmp / utmp / btmp: Binary login databases under /var/log and /var/run. wtmp logs successful logins, utmp reflects current sessions, btmp logs failed login attempts. Read with last, w, lastb.
- journalctl: The query interface to systemd-journald, the binary structured-logging daemon that backs modern Linux distributions. Binary database under /var/log/journal/.
- auditd: The Linux Audit Daemon. Writes /var/log/audit/audit.log based on rules configured in /etc/audit/audit.rules. Read with ausearch and aureport.
- SUID / SGID: Set-User-ID and Set-Group-ID permission bits. A SUID binary runs with the owner's privileges regardless of the invoking user. A common persistence and privilege-escalation vector.
- Plaso / log2timeline: The cross-artifact super-timeline tool. log2timeline.py extracts events from many sources into a .plaso storage file; psort.py renders the timeline to CSV or Elasticsearch.
The file system layer: ext4, inodes and journals
Linux servers in Indian production are overwhelmingly ext4, with XFS dominant on RHEL-family RAID arrays, Btrfs visible on SUSE and on some Synology NAS evidence, and ZFS on FreeNAS-derived storage appliances. Each file system answers the same examiner questions differently: where do MAC times live, how is the journal organised, what happens at deletion, and what survives.
ext4 keeps file metadata in inodes. An inode carries the permissions (mode bits), the owning UID and GID, the file size, the link count, four timestamps (atime, mtime, ctime and crtime since ext4), and an extent tree that locates the file's data blocks on disk. The file name does not live in the inode. It lives in the parent directory entry, which is an array of (inode_number, name) pairs. The practical consequence: when a file is deleted on ext4, the directory entry's inode number is typically zeroed (or the entry is marked unused), but the inode itself often survives with its block pointers cleared. The data blocks themselves are unmodified until reallocated.
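The inode-versus-directory-entry split can be seen on any mounted file system. The following is a minimal sketch, assuming Python 3 on the analysis host; the helper name inode_summary is hypothetical. It reads only what the kernel exposes through lstat, so crtime is not included (Python's os.stat does not return it on Linux; the stat command or debugfs is the usual route for that field).

```python
import os
import stat
from datetime import datetime, timezone


def inode_summary(path):
    """Return the inode-level metadata the kernel exposes for a path.

    The file name is deliberately absent from the result: it belongs to
    the parent directory entry, not to the inode.
    """
    st = os.lstat(path)  # lstat describes a symlink itself, not its target

    def iso(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    return {
        "inode": st.st_ino,
        "mode": stat.filemode(st.st_mode),  # e.g. -rwsr-xr-x
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "links": st.st_nlink,  # number of directory entries pointing here
        "atime": iso(st.st_atime),
        "mtime": iso(st.st_mtime),
        "ctime": iso(st.st_ctime),
    }
```

Calling it on two hard links to the same file returns the same inode number and a link count of 2, which is the behaviour described above: two names, one inode.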
The ext4 journal (the jbd2 log) records metadata changes in a reserved journal area inside the file system itself (by default a hidden journal inode, inode 8). The journal is a circular buffer; for a busy server it cycles in minutes. For a quiet workstation, hours of journal history can survive. Tools like extundelete and ext4magic parse the journal to recover recently deleted files even after the inode is cleared, because the journal carries the pre-deletion inode state.
| File system | Inode timestamps | Journal | Deletion behaviour |
|---|---|---|---|
| ext2 | atime, mtime, ctime (second precision) | None | Inode block pointers preserved until reallocated; deleted files highly recoverable. |
| ext3 | atime, mtime, ctime (second precision) | Optional metadata journal (jbd) | Inode block pointers zeroed on delete; older inode copies in the journal help recover recent deletes. |
| ext4 | atime, mtime, ctime, crtime (nanosecond) | jbd2 metadata journal, optional data journal | Inode block pointers zeroed on delete; extundelete and ext4magic parse journal. |
| XFS | atime, mtime, ctime (crtime only on v5-format inodes) | Internal log section | Aggressive delayed allocation; deletion recovery harder, xfs_undelete project. |
| Btrfs | atime, mtime, ctime, otime | Copy-on-write, no traditional journal | Snapshots are the primary recovery surface; btrfs-find-root for catastrophic loss. |
/proc and /sys are not file systems in the on-disk sense. They are virtual interfaces to kernel state. /proc/<pid>/ exposes per-process state (cmdline, exe symlink, fd descriptors, maps, environ); /sys/class/ exposes device-driver state. Neither survives a reboot, but on a live system both are gold. The examiner who can read /proc/<pid>/maps for a suspicious PID gets a list of every memory-mapped library and file the process has open, which is often the fastest route to identifying an unfamiliar payload.
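Reading /proc/<pid>/maps by eye gets tedious on a process with hundreds of mappings. Here is a minimal sketch, assuming the maps text has already been captured from the live system; the function names and the list of staging prefixes (including /dev/shm/, which the text above does not mention) are illustrative assumptions, not a standard.

```python
def mapped_files(maps_text):
    """Return the distinct file paths named in /proc/<pid>/maps content."""
    seen = []
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping, no backing file
        path = parts[5].strip()
        if not path.startswith("/"):
            continue  # pseudo-entries such as [heap], [stack], [vdso]
        if path not in seen:
            seen.append(path)
    return seen


def suspicious_mappings(maps_text,
                        prefixes=("/tmp/", "/var/tmp/", "/dev/shm/")):
    """Flag mappings backed by deleted files or by world-writable paths.

    The prefix list is an assumption for illustration; tune it per case.
    """
    return [
        path for path in mapped_files(maps_text)
        if path.endswith(" (deleted)") or path.startswith(prefixes)
    ]
```

A mapping whose path ends in " (deleted)" means the process still has the file mapped after its directory entry was removed, which is worth capturing before the process exits.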
Ownership, permissions and the MAC frameworks
POSIX permissions on Linux are three triplets: owner, group, others, each with read, write and execute. The numeric form (chmod 755) and the symbolic form (rwxr-xr-x) are the same thing. The examiner cares about three extra bits beyond the basics.
- Sticky bit (1xxx): on a directory, restricts file deletion to the file's owner. /tmp is the canonical example.
- SGID (2xxx): on a binary, runs as the group owner; on a directory, files created inside inherit the directory's group.
- SUID (4xxx): on a binary, runs as the file owner regardless of the invoking user. This is the bit that matters most for persistence and privilege escalation.
A baseline Linux system has a known set of SUID binaries (/bin/su, /usr/bin/sudo, /usr/bin/passwd, /usr/bin/chsh, a handful more). Any SUID binary outside that baseline, especially one owned by root with mode 4755, is a finding. The textbook detection is find / -perm -4000 -type f 2>/dev/null for SUID and the same command with -perm -2000 for SGID. The 2024 incident response at an Indian PSU's procurement portal turned on a single line in the find output: /usr/local/lib/libcrypt-cache.so.1 set SUID root, masquerading as a libc helper, was the dropper's installed persistence.
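The same check can be run against a read-only mounted image rather than the live root, with the baseline applied in the same pass. This is a sketch under stated assumptions: find_setid is a hypothetical helper, the baseline set is supplied by the examiner, and paths are reported relative to the image root so they line up with the original system's layout.

```python
import os
import stat


def find_setid(root, baseline=frozenset()):
    """Walk a mounted image and report SUID/SGID regular files.

    root     -- mount point of the image (read-only mount assumed)
    baseline -- image-relative paths known to be legitimately set-id
    Returns sorted (path, bits, owner_uid, octal_mode) tuples.
    """
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks out of the image
            except OSError:
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            bits = []
            if st.st_mode & stat.S_ISUID:
                bits.append("SUID")
            if st.st_mode & stat.S_ISGID:
                bits.append("SGID")
            rel = "/" + os.path.relpath(path, root)
            if bits and rel not in baseline:
                findings.append(
                    (rel, "+".join(bits), st.st_uid,
                     oct(stat.S_IMODE(st.st_mode)))
                )
    return sorted(findings)
```

A typical call would be find_setid("/mnt/evidence", baseline={"/usr/bin/passwd", "/usr/bin/sudo"}), where the mount point is whatever the examiner chose. Anything returned is outside the baseline and deserves a hash lookup.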
ACLs extend POSIX with per-user and per-group entries beyond the owner-group-others triplet. getfacl /path/to/file reads them; setfacl writes them. ACLs are not visible in a normal ls -l (the trailing + is the only hint), so a file with an unexpected ACL is easy to miss in a hurried audit.
SELinux and AppArmor are the Mandatory Access Control frameworks. SELinux is RHEL/CentOS default and labels every file with a user:role:type:level context (ls -Z to view). AppArmor is Debian/Ubuntu default and uses per-binary profiles in /etc/apparmor.d/. A forensic finding: a process running with an unexpected SELinux context, or an AppArmor profile that has been silently disabled (aa-status), is a tampering indicator.
Hidden files on Linux are just files whose names start with a dot. ls skips them; ls -la shows them. The convention is older than the file system itself. Attackers exploit it heavily, especially in user home directories (~/.cache/.config-backup/) and with names that survive routine cleanups, such as a directory under /tmp named ". " (a dot followed by a trailing space) or one under /var/tmp named "..." (three dots).
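Names built from dots and whitespace are easy to miss in a terminal listing, so it can help to flag them mechanically. A minimal sketch, with hypothetical helper names and three illustrative rules (not an exhaustive list of evasive naming tricks):

```python
import os


def odd_name_reasons(name):
    """Return the reasons a file or directory name deserves a second look."""
    reasons = []
    if name.startswith("."):
        reasons.append("hidden")
    if name != name.strip():
        reasons.append("leading/trailing whitespace")
    if len(name) > 2 and set(name) == {"."}:
        reasons.append("dots only")
    return reasons


def scan_names(root):
    """Walk a directory tree and list every entry with a flagged name."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            reasons = odd_name_reasons(name)
            if reasons:
                hits.append((os.path.join(dirpath, name), reasons))
    return hits
```

Ordinary dotfiles such as .bashrc come back with only the "hidden" reason, so in practice the entries worth reading first are those with more than one reason attached.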
The /var/log universe
The Indian server-side forensic workflow lives in /var/log/. Different distributions split it differently, but the artifact families are stable.
- /var/log/auth.log on Debian and Ubuntu, /var/log/secure on RHEL and CentOS. Authentication events: ssh logins (successful and failed), sudo invocations, su, pam events. This is the first file the examiner reads.
- /var/log/syslog on Debian/Ubuntu, /var/log/messages on RHEL/CentOS. General system events that the kernel and daemons emit. Drivers loading, mounts happening, services starting and stopping.
- /var/log/wtmp: binary database of successful login and logout events. Read with last. Records the username, terminal, source IP (for remote logins), and the duration of the session.
- /var/log/btmp: binary database of failed login attempts. Read with lastb. The Indian web hosting reality is that lastb on any internet-facing server shows tens of thousands of ssh brute-force attempts; the signal is in the source IP geography and timing patterns.
- /var/log/lastlog: binary database of the single most recent login per user. Read with lastlog. Useful for "when did root last log in directly" questions.
- /var/log/kern.log or kernel events inside /var/log/messages: OOM kills, hardware errors, USB plug events.
- /var/log/audit/audit.log if auditd is running: rule-driven kernel-level audit records.
- /var/log/journal/<machine-id>/system.journal: the systemd-journald binary database, on modern distributions the canonical location for everything except the legacy text files above.
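When last and lastb are unavailable on the analysis host, or a wtmp file has been partly overwritten, the records can be decoded directly. The sketch below assumes the 384-byte record layout used by glibc on x86-64 Linux (little-endian, 32-bit timestamp fields); other architectures and libc implementations may differ, so treat the struct format as an assumption to verify against the source system. The names UTMP, UT_TYPES and parse_utmp are hypothetical.

```python
import struct

# Assumed layout: type, pad, pid, line[32], id[4], user[32], host[256],
# exit status (2 shorts), session, tv_sec, tv_usec, addr_v6[4], unused[20]
UTMP = struct.Struct("<h2xi32s4s32s256shhiii4i20x")

UT_TYPES = {
    1: "RUN_LVL", 2: "BOOT_TIME", 5: "INIT_PROCESS",
    6: "LOGIN_PROCESS", 7: "USER_PROCESS", 8: "DEAD_PROCESS",
}


def _text(raw):
    """Decode a NUL-padded fixed-width field."""
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def parse_utmp(data):
    """Decode every complete record in raw wtmp/btmp/utmp bytes."""
    records = []
    for offset in range(0, len(data) - UTMP.size + 1, UTMP.size):
        fields = UTMP.unpack_from(data, offset)
        ut_type, pid, line, _id, user, host = fields[:6]
        records.append({
            "type": UT_TYPES.get(ut_type, str(ut_type)),
            "pid": pid,
            "line": _text(line),
            "user": _text(user),
            "host": _text(host),
            "epoch": fields[9],  # tv_sec, seconds since the Unix epoch (UTC)
        })
    return records
```

Usage is parse_utmp(open(path_to_wtmp, "rb").read()). In wtmp, a USER_PROCESS record followed by a DEAD_PROCESS record on the same line gives the session duration that last reports.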
systemd-journald is the framework that ate most of the rest. The journal database under /var/log/journal/<machine-id>/ is binary, indexed, and queryable with rich filters. journalctl --since "2026-05-10 00:00" --until "2026-05-12 23:59" slices by time. journalctl _SYSTEMD_UNIT=sshd.service filters by service. journalctl -k gives kernel ring buffer entries. journalctl _UID=1001 gives every entry attributed to a specific user. The forensic examiner extracts the binary journal directly from the image and runs journalctl --file /path/to/system.journal on the analysis host.
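For post-processing outside journalctl, the journal can be exported as one JSON object per line with journalctl --file /path/to/system.journal -o json and filtered in a script. The following is a sketch, assuming that export format; journal_events is a hypothetical helper, and it relies on the __REALTIME_TIMESTAMP field being microseconds since the epoch.

```python
import json
from datetime import datetime, timezone


def journal_events(json_lines, comm=None):
    """Yield (utc_datetime, _COMM, MESSAGE) from `journalctl -o json` lines.

    comm -- if given, keep only entries whose _COMM field matches exactly
    """
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if comm is not None and entry.get("_COMM") != comm:
            continue
        usec = int(entry["__REALTIME_TIMESTAMP"])
        # Integer arithmetic keeps microsecond precision exact
        when = datetime.fromtimestamp(
            usec // 1_000_000, tz=timezone.utc
        ).replace(microsecond=usec % 1_000_000)
        message = entry.get("MESSAGE")
        if isinstance(message, list):
            # Non-UTF-8 payloads are exported as arrays of byte values
            message = bytes(message).decode("utf-8", "replace")
        yield when, entry.get("_COMM"), message
```

Keeping timestamps in UTC until the final report avoids confusion between the server's configured time zone and IST.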
Shell history, cron and the persistence layer
Shell history is the file the attacker tries to wipe. ~/.bash_history is the default for bash; ~/.zsh_history for zsh. The environment variables that govern them are worth memorising:
- HISTSIZE: maximum number of commands kept in memory for the current session.
- HISTFILESIZE: maximum number of commands kept in the history file on disk.
- HISTFILE: the file path. Setting it to /dev/null is the classic anti-forensic trick.
- HISTTIMEFORMAT: when set, prefixes each history entry with a timestamp. When unset, history entries are bare commands with no time information.
The forensic reality on Indian production servers is that HISTTIMEFORMAT is usually unset, which means the bash history is a flat list of commands with no times attached. Examiners deal with this in three ways. First, cross-reference against /var/log/auth.log to bracket the user's session windows: a command in the history at position 47 is between the login at 14:02 and the logout at 14:55. Second, look for .bash_history.<timestamp> rotated files left behind by misconfigured rotation. Third, parse the systemd journal for _COMM=bash and _COMM=sudo entries, which carry timestamps even when bash history does not.
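Before falling back to session bracketing, it is worth checking whether any part of the history file does carry timestamps. When HISTTIMEFORMAT was set, bash writes each entry's time to the file as a line of the form #<epoch seconds> immediately before the command. A minimal sketch, with a hypothetical function name, that separates timed from untimed entries:

```python
def parse_bash_history(text):
    """Return (epoch_or_None, command) pairs from a bash history file.

    A line of '#' followed only by digits is treated as the timestamp
    of the next command. A real comment typed as '#12345' would be
    misread the same way, so spot-check surprising values.
    Multi-line commands are not reassembled in this sketch.
    """
    entries = []
    pending = None
    for line in text.splitlines():
        if line.startswith("#") and line[1:].isdigit():
            pending = int(line[1:])
            continue
        if not line:
            continue
        entries.append((pending, line))
        pending = None  # a timestamp applies to one command only
    return entries
```

Entries that come back with None are the ones that still need bracketing against auth.log sessions or journal records; a file that mixes both kinds suggests HISTTIMEFORMAT was set for only part of the account's lifetime.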
Cron is the persistence vector. Linux cron has more places to hide than examiners typically check.
- System crontab: Read /etc/crontab. Single file, system-wide jobs, runs as the user named in each line. Most legitimate distribution maintenance lives here.
- Per-package cron drops: List /etc/cron.d/. One file per package, same format as /etc/crontab. Attackers drop persistence here under innocuous names like .cron-update or libsystemd-rotate.
- Interval directories: Inspect /etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/, /etc/cron.monthly/. Scripts dropped here run at the named cadence under run-parts.
- Per-user crontabs: List /var/spool/cron/crontabs/ on Debian or /var/spool/cron/ on RHEL. One file per user with a personal crontab. crontab -l as that user shows the same. Attackers commonly add to root's user crontab because it is less audited than /etc/crontab.
- Anacron: Read /etc/anacrontab. Anacron handles jobs on systems that are not always on. Less commonly abused but worth checking.
- systemd timers: Modern persistence increasingly uses systemd timers in /etc/systemd/system/*.timer rather than cron at all. systemctl list-timers --all enumerates them.
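The locations above can be swept in one pass over a mounted image so that none is skipped under time pressure. This is a sketch, assuming a read-only mount; CRON_LOCATIONS and cron_inventory are hypothetical names, and the list covers only the paths named in this section.

```python
import os

# Image-relative paths taken from the checklist above
CRON_LOCATIONS = [
    "etc/crontab",
    "etc/anacrontab",
    "etc/cron.d",
    "etc/cron.hourly",
    "etc/cron.daily",
    "etc/cron.weekly",
    "etc/cron.monthly",
    "var/spool/cron/crontabs",  # Debian/Ubuntu per-user crontabs
    "var/spool/cron",           # RHEL/CentOS per-user crontabs
    "etc/systemd/system",       # only *.timer units are reported
]


def cron_inventory(image_root):
    """List every scheduled-job file present under a mounted image."""
    found = []
    for rel in CRON_LOCATIONS:
        path = os.path.join(image_root, rel)
        if os.path.isfile(path):
            found.append("/" + rel)
        elif os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                full = os.path.join(path, name)
                # Keep symlinks too: their targets may not resolve
                # correctly from the analysis host
                if not (os.path.isfile(full) or os.path.islink(full)):
                    continue
                if rel == "etc/systemd/system" and not name.endswith(".timer"):
                    continue
                found.append("/" + rel + "/" + name)
    return found
```

The inventory includes dot-prefixed files, which matters because a drop named like .cron-update would not appear in a plain ls of /etc/cron.d/.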
A real Indian incident in 2024 at a media-streaming startup turned on a cron entry in /var/spool/cron/crontabs/www-data that exfiltrated /etc/passwd and /etc/shadow to an attacker-controlled host every six hours. The system crontab had been audited and was clean; the per-user crontab for the www-data web user had been overlooked by the in-house team. CERT-In's post-incident note (CIAD-2024-0144) flagged the per-user crontab location as a recurring blind spot.
auditd, temp files and the rootkit detection layer
auditd is the Linux Audit Daemon. It runs as a userspace process that hooks into the kernel audit subsystem and writes records to /var/log/audit/audit.log. Rules live in /etc/audit/audit.rules. A baseline production rule set typically watches /etc/passwd, /etc/shadow, /etc/sudoers, the cron drop directories, and execve syscalls by privileged users.
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k privilege
-a always,exit -F arch=b64 -S execve -F euid=0 -k root-exec

ausearch -k identity --start 05/10/2026 00:00:00 queries records matching the identity rule key from that start time onward; ausearch takes the start date and time as two separate arguments, and the date format follows the host's locale. aureport --summary produces a high-level summary across the log. The Indian CERT-In log retention directions of 28 April 2022 effectively require Indian VPS and data-centre operators to keep audit and authentication logs for 180 days; this is the regulation that turned auditd from a nice-to-have into a deployed-everywhere baseline on Indian-hosted Linux servers.
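When ausearch is not installed on the analysis host, the raw audit.log can still be filtered by rule key. The sketch below assumes the usual record prefix type=... msg=audit(<epoch>.<ms>:<serial>): and a key="..." field on the record that matched the rule; audit_records is a hypothetical helper and does not decode hex-encoded fields the way ausearch -i does.

```python
import re

AUDIT_HEADER = re.compile(
    r"^type=(?P<type>\S+) msg=audit\("
    r"(?P<epoch>\d+)\.(?P<ms>\d+):(?P<serial>\d+)\):"
)
AUDIT_KEY = re.compile(r'(?:^|\s)key=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))')


def audit_records(lines, key=None):
    """Parse audit.log lines, optionally keeping only one rule key."""
    records = []
    for line in lines:
        header = AUDIT_HEADER.match(line)
        if not header:
            continue
        rec_key = None
        found = AUDIT_KEY.search(line)
        if found:
            rec_key = found.group("quoted")
            if rec_key is None:
                rec_key = found.group("bare")
            if rec_key == "(null)":  # how an unset key is logged
                rec_key = None
        if key is not None and rec_key != key:
            continue
        records.append({
            "type": header.group("type"),
            "epoch": int(header.group("epoch")),
            "ms": int(header.group("ms")),
            "serial": int(header.group("serial")),  # shared by one event
            "key": rec_key,
        })
    return records
```

Records belonging to the same event share a serial number, so once a keyed SYSCALL record is found, the companion PATH and CWD records can be pulled by serial.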
Temp directories are forensic-significant for two reasons. /tmp and /var/tmp are world-writable, and attackers use them as staging areas. The difference: /tmp is cleared on reboot on most distributions (via systemd-tmpfiles); /var/tmp persists across reboots. A payload dropped in /var/tmp survives a reboot and is easier to spot post-incident than one in /tmp that has been swept.
/etc/fstab lists every persistent mount. The mount command shows what is mounted now. A mismatch between the two is a finding: a device that is mounted but not in fstab was mounted by hand, possibly by an attacker; a device in fstab but not mounted may have been removed. readlink -f resolves symbolic links to their target; chains of symbolic links in user home directories or /usr/local/ are a hiding pattern.
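The fstab-versus-mounted comparison is mechanical enough to script. A minimal sketch, assuming the examiner has the text of /etc/fstab and of /proc/mounts (or saved mount output in the same column order); the function names and the set of pseudo file system types to ignore are illustrative assumptions.

```python
# Kernel and virtual file systems that never appear in fstab on most
# systems; this list is an assumption and should be tuned per case.
PSEUDO_TYPES = frozenset({
    "proc", "sysfs", "devtmpfs", "devpts", "tmpfs", "cgroup", "cgroup2",
    "securityfs", "debugfs", "tracefs", "pstore", "mqueue", "hugetlbfs",
    "configfs", "bpf", "autofs", "fusectl", "binfmt_misc",
})


def mount_points(table_text):
    """Map mount point -> (device, fstype) from fstab or /proc/mounts text."""
    points = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fstype = fields[:3]
        if fstype == "swap" or fstype in PSEUDO_TYPES:
            continue
        points[mount_point] = (device, fstype)
    return points


def compare_mounts(fstab_text, mounts_text):
    """Return (mounted_but_not_in_fstab, in_fstab_but_not_mounted)."""
    declared = mount_points(fstab_text)
    live = mount_points(mounts_text)
    extra = sorted(set(live) - set(declared))
    missing = sorted(set(declared) - set(live))
    return extra, missing
```

Mount points containing spaces are written with octal escapes such as \040 in both files, so the two sides still compare equal without decoding.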
Linux-specific malware artifacts cluster around two techniques. First, kernel-mode rootkits (loadable kernel modules under /lib/modules/<kernel>/) hide files, processes and network sockets by hooking syscalls. Detection: lsmod against a known-clean baseline, modinfo for module metadata, and offline disk-image scans because a running rootkit can lie to live tooling. Second, LD_PRELOAD user-space hooking, where a shared library named in /etc/ld.so.preload or in the LD_PRELOAD environment variable is injected into every dynamically linked process; the library intercepts readdir, open, connect and friends. Detection: strace against a known-clean binary, comparison of ls against find (different syscall patterns; a hook that fools one may not fool the other), and tools like rkhunter and chkrootkit that look for the common signatures.
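Because a preload hook can hide its own files from live tooling, /etc/ld.so.preload is best read from the offline image. A sketch under stated assumptions: preload_entries is a hypothetical helper, and the parsing rules (entries separated by whitespace or colons, # starting a comment) reflect the author's understanding of glibc's behaviour rather than a documented guarantee.

```python
import os
import re


def preload_entries(image_root):
    """Return (library_path, exists_in_image) for /etc/ld.so.preload.

    An empty list means the file is absent or names no libraries,
    which is the normal state on most servers.
    """
    path = os.path.join(image_root, "etc/ld.so.preload")
    try:
        with open(path, "r", errors="replace") as handle:
            text = handle.read()
    except FileNotFoundError:
        return []
    results = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        for entry in re.split(r"[\s:]+", line):
            if not entry:
                continue
            inside = os.path.join(image_root, entry.lstrip("/"))
            results.append((entry, os.path.lexists(inside)))
    return results
```

Any entry at all is worth explaining. An entry whose library is missing from the image is also informative, since it may point at a payload that was removed after use.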

Toolchain, plaso timelining and the Indian context
Casework analysis integrates artifacts across all layers simultaneously. Three tool families do the heavy lifting on an Indian SFSL Linux investigation.
Sleuth Kit and Autopsy. Sleuth Kit's command-line tools (mmls, fls, icat, istat, blkls, tsk_recover) parse the on-disk image without mounting it, which is the right move for evidence integrity. mmls image.dd lists partitions; fls -o <offset> -r -m / image.dd walks the file system at the partition's starting sector offset (as reported by mmls) and emits a bodyfile suitable for timelining; icat -o <offset> image.dd <inode> extracts a file by inode number. Autopsy is the GUI layer that wraps the same tooling and adds keyword search, hash lookup, and report generation. Indian academic labs at NFSU Gandhinagar and at LNJN NICFS Delhi teach Autopsy as the default GUI.
log2timeline / plaso. Plaso ingests every supported artifact format from a mounted image or a directory and writes a .plaso storage file. psort.py renders the storage file to CSV, Elasticsearch, or any other supported sink. The output is the super-timeline: one row per event, sortable by time, joinable against any other CSV. For a Linux server image, plaso pulls bash history, syslog, auth.log, wtmp, journald binaries, auditd records, cron drops, and file system MAC times into a single timeline. The CSM-side analogue is the chain of custody workflow that wraps the analytical result; the digital chain begins at imaging and ends at the BSA 2023 Section 63 certificate.
mac-robber. A small but useful tool that walks a mounted file system and emits a bodyfile for mactime. Faster than fls when the image is already mounted read-only for triage. Often used as the first-pass timelining step.
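The bodyfile that fls -m and mac-robber emit is plain pipe-delimited text, so a quick first-pass timeline can be built without further tooling. The sketch below assumes the Sleuth Kit 3.x body format (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime, times in epoch seconds); bodyfile_events is a hypothetical helper and is not a replacement for mactime.

```python
def bodyfile_events(lines):
    """Turn bodyfile lines into sorted (epoch, flag, name) events.

    flag is one of a, m, c, b (access, modify, change, birth), matching
    the letters mactime uses. Zero timestamps are skipped as unset.
    """
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        try:
            _md5, rest = line.split("|", 1)
            # Split from the right so a '|' inside the name survives
            name, _inode, _mode, _uid, _gid, _size, atime, mtime, ctime, crtime = (
                rest.rsplit("|", 9)
            )
        except ValueError:
            continue  # malformed line
        for flag, value in (("a", atime), ("m", mtime),
                            ("c", ctime), ("b", crtime)):
            try:
                epoch = int(value)
            except ValueError:
                continue
            if epoch > 0:
                events.append((epoch, flag, name))
    return sorted(events)
```

Sorting the tuples orders events by time first, which is usually enough to see which files changed in the minutes around a suspicious login before committing to a full plaso run.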
Linux dominates the server side of the Indian internet: BSNL infrastructure, payment-gateway endpoints, NIC-hosted government portals, and the bulk of cloud workloads in AWS Mumbai, Azure Pune and GCP Delhi regions. CERT-In's 28 April 2022 directions require data-centre and VPS providers to maintain ICT system logs for 180 days, with mandatory incident reporting within six hours of detection. The effect of this regulation on forensic practice is direct: the auth.log, journald and auditd records the examiner expects to find usually exist for the relevant time window, and the absence of them is itself a finding (and a CERT-In compliance breach by the operator). The case files coming out of the I4C (Indian Cybercrime Coordination Centre) pipeline routinely include auditd exports and journald .journal files as evidence annexures, certified under BSA 2023 Section 63.
Frequently asked questions
Why do Indian SFSLs see more Linux than Windows in cybercrime cases?
Bash history with no timestamps: what can a forensic examiner actually do with it?
How does CERT-In's log retention regulation affect Linux forensic practice?
What is the difference between wtmp, utmp and btmp?
Is journald replacing /var/log/auth.log and /var/log/syslog?
Which tool is the standard cross-artifact timelining choice for a Linux forensic examiner?
How do I detect an LD_PRELOAD rootkit on a live Linux system I cannot reboot?