标签 osquery 下的文章

osquery源码解读之分析process_open_socket

说明

上篇文章主要是对shell_history的实现进行了分析。通过分析可以发现,osquery良好的设计使得源码简单易读。shell_history的整体实现也比较简单,通过读取并解析.bash_history中的内容,获得用户输入的历史命令。本文分析的是process_open_sockets,相比较而言实现更加复杂,对Linux也需要有更深的了解。

使用说明

首先查看process_open_sockets表的定义:

table_name("process_open_sockets")
description("Processes which have open network sockets on the system.")
schema([
    Column("pid", INTEGER, "Process (or thread) ID", index=True),
    Column("fd", BIGINT, "Socket file descriptor number"),
    Column("socket", BIGINT, "Socket handle or inode number"),
    Column("family", INTEGER, "Network protocol (IPv4, IPv6)"),
    Column("protocol", INTEGER, "Transport protocol (TCP/UDP)"),
    Column("local_address", TEXT, "Socket local address"),
    Column("remote_address", TEXT, "Socket remote address"),
    Column("local_port", INTEGER, "Socket local port"),
    Column("remote_port", INTEGER, "Socket remote port"),
    Column("path", TEXT, "For UNIX sockets (family=AF_UNIX), the domain path"),
])
extended_schema(lambda: LINUX() or DARWIN(), [
    Column("state", TEXT, "TCP socket state"),
])
extended_schema(LINUX, [
    Column("net_namespace", TEXT, "The inode number of the network namespace"),
])
implementation("system/process_open_sockets@genOpenSockets")
examples([
  "select * from process_open_sockets where pid = 1",
])

其中有几个列名需要说明一下:

  • fd,表示文件描述符
  • socket,进行网络通讯时,socket通信对应的inode number
  • family,表示是IPv4/IPv6,最后的结果是以数字的方式展示
  • protocol,表示是TCP/UDP。

我们进行一个简单的反弹shell的操作,然后使用查询process_open_sockets表的信息。

osquery> select pos.*,p.cwd,p.cmdline from process_open_sockets pos left join processes p where pos.family=2 and pos.pid=p.pid and net_namespace<>0;
+-------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+-----------------+-----------+
| pid   | fd | socket   | family | protocol | local_address | remote_address | local_port | remote_port | path | state       | net_namespace | cwd             | cmdline   |
+-------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+-----------------+-----------+
| 37272 | 15 | 52319299 | 2      | 6        | 192.168.2.142 | 172.22.0.176   | 43522      | 9091        |      | ESTABLISHED | 4026531956    | /home/xingjun   | osqueryi  |
| 91155 | 2  | 56651533 | 2      | 6        | 192.168.2.142 | 192.168.2.150  | 53486      | 8888        |      | ESTABLISHED | 4026531956    | /proc/79036/net | /bin/bash |
+-------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+-----------------+-----------+

process_open_sockets表的实现是位于osquery/tables/networking/linux/process_open_sockets.cpp中。

分析

process_open_sockets的实现全部是在QueryData genOpenSockets(QueryContext &context)一个方法中。

官方给出的分析步骤是:

Data for this table is fetched from 3 different sources and correlated.

1.Collect all sockets associated with each pid by going through all files under /proc/<pid>/fd and search for links of the type socket:[<inode>]. Extract the inode and fd (filename) and index it by inode number. The inode can then be used to correlate pid and fd with the socket information collected on step 3. The map generated in this step will only contain sockets associated with pids in the list, so it will also be used to filter the sockets later if pid_filter is set.

2.Collect the inode for the network namespace associated with each pid. Every time a new namespace is found execute step 3 to get socket basic information.

3.Collect basic socket information for all sockets under a specifc network namespace. This is done by reading through files under /proc/<pid>/net for the first pid we find in a certain namespace. Notice this will collect information for all sockets on the namespace not only for sockets associated with the specific pid, therefore only needs to be run once. From this step we collect the inodes of each of the sockets, and will use that to correlate the socket information with the information collect on steps 1 and 2.

其实大致步骤就是:

  1. 收集进程所对应的fd信息,尤其是socketinode信息;
  2. 收集进程的namespaceinode信息;
  3. 读取/proc/<pid>/net中的信息,与第一步中的socketinode信息进行比对,找出pid所对应的网络连接信息。

为了方便说明,我对整个函数的代码进行切割,分步说明。

获取pid信息

std::set <std::string> pids;
if (context.constraints["pid"].exists(EQUALS)) {
    pids = context.constraints["pid"].getAll(EQUALS);
}

bool pid_filter = !(pids.empty() ||
                    std::find(pids.begin(), pids.end(), "-1") != pids.end());

if (!pid_filter) {
    pids.clear();
    status = osquery::procProcesses(pids);
    if (!status.ok()) {
        VLOG(1) << "Failed to acquire pid list: " << status.what();
        return results;
    }
}
  • 前面的context.constraints["pid"].exists(EQUALS)pid_filter为了判断在SQL语句中是否存在where子句以此拿到选择的pid
  • 调用status = osquery::procProcesses(pids);拿到对应的PID信息。

跟踪进入到osquery/filesystem/linux/proc.cpp:procProcesses(std::set<std::string>& processes):

Status procProcesses(std::set<std::string>& processes) {
  auto callback = [](const std::string& pid,
                     std::set<std::string>& _processes) -> bool {
    _processes.insert(pid);
    return true;
  };

  return procEnumerateProcesses<decltype(processes)>(processes, callback);
}

继续跟踪进入到osquery/filesystem/linux/proc.h:procEnumerateProcesses(UserData& user_data,bool (*callback)(const std::string&, UserData&))

const std::string kLinuxProcPath = "/proc";
.....
template<typename UserData>
Status procEnumerateProcesses(UserData &user_data,bool (*callback)(const std::string &, UserData &)) {
    boost::filesystem::directory_iterator it(kLinuxProcPath), end;

    try {
        for (; it != end; ++it) {
            if (!boost::filesystem::is_directory(it->status())) {
                continue;
            }

            // See #792: std::regex is incomplete until GCC 4.9
            const auto &pid = it->path().leaf().string();
            if (std::atoll(pid.data()) <= 0) {
                continue;
            }

            bool ret = callback(pid, user_data);
            if (ret == false) {
                break;
            }
        }
    } catch (const boost::filesystem::filesystem_error &e) {
        VLOG(1) << "Exception iterating Linux processes: " << e.what();
        return Status(1, e.what());
    }

    return Status(0);
}
  • boost::filesystem::directory_iterator it(kLinuxProcPath), end;遍历/proc目录下面所有的文件,
  • const auto &pid = it->path().leaf().string();..; bool ret = callback(pid, user_data);,通过it->path().leaf().string()判断是否为数字,之后调用bool ret = callback(pid, user_data);
  • callback方法_processes.insert(pid);return true;将查询到的pid全部记录到user_data中。

以一个反弹shell的例子为例,使用osqueryi查询到的信息如下:

osquery> select * from process_open_sockets where pid=14960; 
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+
| pid   | fd | socket | family | protocol | local_address | remote_address | local_port | remote_port | path | state       | net_namespace |
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+
| 14960 | 2  | 307410 | 2      | 6        | 192.168.2.156 | 192.168.2.145  | 51118      | 8888        |      | ESTABLISHED | 4026531956    |
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+

获取进程对应的pid和fd信息

/* Use a set to record the namespaces already processed */
std::set <ino_t> netns_list;
SocketInodeToProcessInfoMap inode_proc_map;
SocketInfoList socket_list;
for (const auto &pid : pids) {
    /* Step 1 */
    status = procGetSocketInodeToProcessInfoMap(pid, inode_proc_map);
    if (!status.ok()) {
        VLOG(1) << "Results for process_open_sockets might be incomplete. Failed "
                    "to acquire socket inode to process map for pid "
                << pid << ": " << status.what();
    }

在拿到所有的需要查询的pid信息之后,调用status = procGetSocketInodeToProcessInfoMap(pid, inode_proc_map);,顾名思义就是用于获取进程所对应的socket inode编号。进入到osquery/filesystem/linux/proc.cpp:procGetSocketInodeToProcessInfoMap()中:

Status procGetSocketInodeToProcessInfoMap(const std::string &pid,SocketInodeToProcessInfoMap &result) {
    auto callback = [](const std::string &_pid,
                        const std::string &fd,
                        const std::string &link,
                        SocketInodeToProcessInfoMap &_result) -> bool {
        /* We only care about sockets. But there will be other descriptors. */
        if (link.find("socket:[") != 0) {
            return true;
        }

        std::string inode = link.substr(8, link.size() - 9);
        _result[inode] = {_pid, fd};
        return true;
    };

    return procEnumerateProcessDescriptors<decltype(result)>(
            pid, result, callback);
}

其中的auto callback定义的是一个回调函数,进入到procEnumerateProcessDescriptors()中分析:

const std::string kLinuxProcPath = "/proc";
....
template<typename UserData>
Status procEnumerateProcessDescriptors(const std::string &pid,
                                        UserData &user_data,
                                        bool (*callback)(const std::string &pid,
                                                        const std::string &fd,
                                                        const std::string &link,
                                                        UserData &user_data)) {
    std::string descriptors_path = kLinuxProcPath + "/" + pid + "/fd";

    try {
        boost::filesystem::directory_iterator it(descriptors_path), end;

        for (; it != end; ++it) {
            auto fd = it->path().leaf().string();

            std::string link;
            Status status = procReadDescriptor(pid, fd, link);
            if (!status.ok()) {
                VLOG(1) << "Failed to read the link for file descriptor " << fd
                        << " of pid " << pid << ". Data might be incomplete.";
            }

            bool ret = callback(pid, fd, link, user_data);
            if (ret == false) {
                break;
            }
        }
    } catch (boost::filesystem::filesystem_error &e) {
        VLOG(1) << "Exception iterating process file descriptors: " << e.what();
        return Status(1, e.what());
    }

    return Status(0);
}

这个代码写得十分清晰。

1.遍历/proc/pid/fd,拿到所有的文件描述符。在本例中即为/proc/14960/fd

1.jpg

2.回调bool ret = callback(pid, fd, link, user_data);,即之前在procGetSocketInodeToProcessInfoMap中定义的:

auto callback = [](const std::string &_pid,
                    const std::string &fd,
                    const std::string &link,
                    SocketInodeToProcessInfoMap &_result) -> bool {
    /* We only care about sockets. But there will be other descriptors. */
    if (link.find("socket:[") != 0) {
        return true;
    }

    std::string inode = link.substr(8, link.size() - 9);
    _result[inode] = {_pid, fd};
    return true;
};

代码也十分地简单,拿到fd所对应的link,检查是否存在socket:[,如果存在获取对应的inode。由于查询的是process_open_sockets,所以我们仅仅只关心存在socket的link,在本例中就是307410。最终在SocketInodeToProcessInfoMap中的结构就是_result[inode] = {_pid, fd};。以inode作为key,包含了pidfd的信息。

获取进程对应的ns信息

在上一步status = procGetSocketInodeToProcessInfoMap(pid, inode_proc_map);执行完毕之后,得到_result[inode] = {_pid, fd};。将inodepidfd进行了关联。接下里就是解析进程对应的ns信息。

ino_t ns;
ProcessNamespaceList namespaces;
status = procGetProcessNamespaces(pid, namespaces, {"net"});
if (status.ok()) {
    ns = namespaces["net"];
} else {
    /* If namespaces are not available we allways set ns to 0 and step 3 will
        * run once for the first pid in the list.
        */
    ns = 0;
    VLOG(1) << "Results for the process_open_sockets might be incomplete."
                "Failed to acquire network namespace information for process "
                "with pid "
            << pid << ": " << status.what();
}
跟踪进入到`status = procGetProcessNamespaces(pid, namespaces, {"net"});`,进入到`osquery/filesystem/linux/proc.cpp:procGetProcessNamespaces()`
const std::string kLinuxProcPath = "/proc";
...
Status procGetProcessNamespaces(const std::string &process_id,ProcessNamespaceList &namespace_list,std::vector <std::string> namespaces) {
    namespace_list.clear();
    if (namespaces.empty()) {
        namespaces = kUserNamespaceList;
    }
    auto process_namespace_root = kLinuxProcPath + "/" + process_id + "/ns";
    for (const auto &namespace_name : namespaces) {
        ino_t namespace_inode;
        auto status = procGetNamespaceInode(namespace_inode, namespace_name, process_namespace_root);
        if (!status.ok()) {
            continue;
        }
        namespace_list[namespace_name] = namespace_inode;
    }
    return Status(0, "OK");
}

遍历const auto &namespace_name : namespaces,之后进入到process_namespace_root中,调用procGetNamespaceInode(namespace_inode, namespace_name, process_namespace_root);进行查询。在本例中namespaces{"net"},process_namespace_root/proc/14960/ns

分析procGetNamespaceInode(namespace_inode, namespace_name, process_namespace_root):

Status procGetNamespaceInode(ino_t &inode,const std::string &namespace_name,const std::string &process_namespace_root) {
    inode = 0;
    auto path = process_namespace_root + "/" + namespace_name;
    char link_destination[PATH_MAX] = {};
    auto link_dest_length = readlink(path.data(), link_destination, PATH_MAX - 1);
    if (link_dest_length < 0) {
        return Status(1, "Failed to retrieve the inode for namespace " + path);
    }

    // The link destination must be in the following form: namespace:[inode]
    if (std::strncmp(link_destination,
                        namespace_name.data(),
                        namespace_name.size()) != 0 ||
        std::strncmp(link_destination + namespace_name.size(), ":[", 2) != 0) {
        return Status(1, "Invalid descriptor for namespace " + path);
    }

    // Parse the inode part of the string; strtoull should return us a pointer
    // to the closing square bracket
    const char *inode_string_ptr = link_destination + namespace_name.size() + 2;
    char *square_bracket_ptr = nullptr;

    inode = static_cast<ino_t>(
            std::strtoull(inode_string_ptr, &square_bracket_ptr, 10));
    if (inode == 0 || square_bracket_ptr == nullptr ||
        *square_bracket_ptr != ']') {
        return Status(1, "Invalid inode value in descriptor for namespace " + path);
    }

    return Status(0, "OK");
}

根据procGetProcessNamespaces()中定义的相关变量,得到path是/proc/pid/ns/net,在本例中是/proc/14960/ns/net。通过inode = static_cast<ino_t>(std::strtoull(inode_string_ptr, &square_bracket_ptr, 10));,解析/proc/pid/ns/net所对应的inode。在本例中:

2.jpg

所以取到的inode4026531956。之后在procGetProcessNamespaces()中执行namespace_list[namespace_name] = namespace_inode;,所以namespace_list['net']=4026531956。最终ns = namespaces["net"];,所以得到的ns=4026531956

解析进程的net信息

// Linux proc protocol define to net stats file name.
const std::map<int, std::string> kLinuxProtocolNames = {
        {IPPROTO_ICMP,    "icmp"},
        {IPPROTO_TCP,     "tcp"},
        {IPPROTO_UDP,     "udp"},
        {IPPROTO_UDPLITE, "udplite"},
        {IPPROTO_RAW,     "raw"},
};
...
if (netns_list.count(ns) == 0) {
    netns_list.insert(ns);

    /* Step 3 */
    for (const auto &pair : kLinuxProtocolNames) {
        status = procGetSocketList(AF_INET, pair.first, ns, pid, socket_list);
        if (!status.ok()) {
            VLOG(1)
                    << "Results for process_open_sockets might be incomplete. Failed "
                        "to acquire basic socket information for AF_INET "
                    << pair.second << ": " << status.what();
        }

        status = procGetSocketList(AF_INET6, pair.first, ns, pid, socket_list);
        if (!status.ok()) {
            VLOG(1)
                    << "Results for process_open_sockets might be incomplete. Failed "
                        "to acquire basic socket information for AF_INET6 "
                    << pair.second << ": " << status.what();
        }
    }
    status = procGetSocketList(AF_UNIX, IPPROTO_IP, ns, pid, socket_list);
    if (!status.ok()) {
        VLOG(1)
                << "Results for process_open_sockets might be incomplete. Failed "
                    "to acquire basic socket information for AF_UNIX: "
                << status.what();
    }
}

对于icmp/tcp/udp/udplite/raw会调用status = procGetSocketList(AF_INET|AF_INET6|AF_UNIX, pair.first, ns, pid, socket_list);。我们这里仅仅以procGetSocketList(AF_INET, pair.first, ns, pid, socket_list);进行说明(其中的ns就是4026531956)。

Status procGetSocketList(int family, int protocol,ino_t net_ns,const std::string &pid, SocketInfoList &result) {
    std::string path = kLinuxProcPath + "/" + pid + "/net/";

    switch (family) {
        case AF_INET:
            if (kLinuxProtocolNames.count(protocol) == 0) {
                return Status(1,"Invalid family " + std::to_string(protocol) +" for AF_INET familiy");
            } else {
                path += kLinuxProtocolNames.at(protocol);
            }
            break;

        case AF_INET6:
            if (kLinuxProtocolNames.count(protocol) == 0) {
                return Status(1,"Invalid protocol " + std::to_string(protocol) +" for AF_INET6 familiy");
            } else {
                path += kLinuxProtocolNames.at(protocol) + "6";
            }
            break;

        case AF_UNIX:
            if (protocol != IPPROTO_IP) {
                return Status(1,
                                "Invalid protocol " + std::to_string(protocol) +
                                " for AF_UNIX familiy");
            } else {
                path += "unix";
            }

            break;

        default:
            return Status(1, "Invalid family " + std::to_string(family));
    }

    std::string content;
    if (!osquery::readFile(path, content).ok()) {
        return Status(1, "Could not open socket information from " + path);
    }

    Status status(0);
    switch (family) {
        case AF_INET:
        case AF_INET6:
            status = procGetSocketListInet(family, protocol, net_ns, path, content, result);
            break;

        case AF_UNIX:
            status = procGetSocketListUnix(net_ns, path, content, result);
            break;
    }

    return status;
}

由于我们的传参是family=AF_INET,protocol=tcp,net_ns=4026531956,pid=14960。执行流程如下:

1.path += kLinuxProtocolNames.at(protocol);,得到path是/proc/14960/net/tcp

2.osquery::readFile(path, content).ok(),读取文件内容,即/proc/14960/net/tcp所对应的文件内容。在本例中是:

sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
0: 00000000:1538 00000000:0000 0A 00000000:00000000 00:00000000 00000000    26        0 26488 1 ffff912c69c21740 100 0 0 10 0
1: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 28721 1 ffff912c69c23640 100 0 0 10 0
2: 00000000:01BB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27739 1 ffff912c69c21f00 100 0 0 10 0
3: 0100007F:18EB 00000000:0000 0A 00000000:00000000 00:00000000 00000000   988        0 25611 1 ffff912c69c207c0 100 0 0 10 0
4: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27737 1 ffff912c69c226c0 100 0 0 10 0
5: 017AA8C0:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 29031 1 ffff912c69c23e00 100 0 0 10 0
6: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25754 1 ffff912c69c20f80 100 0 0 10 0
7: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25590 1 ffff912c69c20000 100 0 0 10 0
8: 9C02A8C0:C7AE 9102A8C0:22B8 01 00000000:00000000 00:00000000 00000000  1000

3.执行procGetSocketListInet(family, protocol, net_ns, path, content, result);

分析

static Status procGetSocketListInet(int family,int protocol,ino_t net_ns,const std::string &path,const std::string &content,SocketInfoList &result) {
    // The system's socket information is tokenized by line.
    bool header = true;
    for (const auto &line : osquery::split(content, "\n")) {
        if (header) {
            if (line.find("sl") != 0 && line.find("sk") != 0) {
                return Status(1, std::string("Invalid file header for ") + path);
            }
            header = false;
            continue;
        }

        // The socket information is tokenized by spaces, each a field.
        auto fields = osquery::split(line, " ");
        if (fields.size() < 10) {
            VLOG(1) << "Invalid socket descriptor found: '" << line
                    << "'. Skipping this entry";
            continue;
        }

        // Two of the fields are the local/remote address/port pairs.
        auto locals = osquery::split(fields[1], ":");
        auto remotes = osquery::split(fields[2], ":");

        if (locals.size() != 2 || remotes.size() != 2) {
            VLOG(1) << "Invalid socket descriptor found: '" << line
                    << "'. Skipping this entry";
            continue;
        }

        SocketInfo socket_info = {};
        socket_info.socket = fields[9];
        socket_info.net_ns = net_ns;
        socket_info.family = family;
        socket_info.protocol = protocol;
        socket_info.local_address = procDecodeAddressFromHex(locals[0], family);
        socket_info.local_port = procDecodePortFromHex(locals[1]);
        socket_info.remote_address = procDecodeAddressFromHex(remotes[0], family);
        socket_info.remote_port = procDecodePortFromHex(remotes[1]);

        if (protocol == IPPROTO_TCP) {
            char *null_terminator_ptr = nullptr;
            auto integer_socket_state =
                    std::strtoull(fields[3].data(), &null_terminator_ptr, 16);
            if (integer_socket_state == 0 ||
                integer_socket_state >= tcp_states.size() ||
                null_terminator_ptr == nullptr || *null_terminator_ptr != 0) {
                socket_info.state = "UNKNOWN";
            } else {
                socket_info.state = tcp_states[integer_socket_state];
            }
        }

        result.push_back(std::move(socket_info));
    }

    return Status(0);
}

整个执行流程如下:

1.const auto &line : osquery::split(content, "\n");.. auto fields = osquery::split(line, " ");解析文件,读取每一行的内容。对每一行采用空格分割;

2.解析信息

SocketInfo socket_info = {};
socket_info.socket = fields[9];
socket_info.net_ns = net_ns;
socket_info.family = family;
socket_info.protocol = protocol;
socket_info.local_address = procDecodeAddressFromHex(locals[0], family);
socket_info.local_port = procDecodePortFromHex(locals[1]);
socket_info.remote_address = procDecodeAddressFromHex(remotes[0], family);
socket_info.remote_port = procDecodePortFromHex(remotes[1]);

解析/proc/14960/net/tcp文件中的每一行,分别填充至socket_info结构中。但是在/proc/14960/net/tcp并不是所有的信息都是我们需要的,我们还需要对信息进行过滤。可以看到最后一条的inode307410才是我们需要的。

获取进程连接信息

将解析完毕/proc/14960/net/tcp获取socket_info之后,继续执行genOpenSockets()中的代码。

    auto proc_it = inode_proc_map.find(info.socket);
    if (proc_it != inode_proc_map.end()) {
        r["pid"] = proc_it->second.pid;
        r["fd"] = proc_it->second.fd;
    } else if (!pid_filter) {
        r["pid"] = "-1";
        r["fd"] = "-1";
    } else {
        /* If we're filtering by pid we only care about sockets associated with
            * pids on the list.*/
        continue;
    }

    r["socket"] = info.socket;
    r["family"] = std::to_string(info.family);
    r["protocol"] = std::to_string(info.protocol);
    r["local_address"] = info.local_address;
    r["local_port"] = std::to_string(info.local_port);
    r["remote_address"] = info.remote_address;
    r["remote_port"] = std::to_string(info.remote_port);
    r["path"] = info.unix_socket_path;
    r["state"] = info.state;
    r["net_namespace"] = std::to_string(info.net_ns);

    results.push_back(std::move(r));
}

其中关键代码是:

auto proc_it = inode_proc_map.find(info.socket);
if (proc_it != inode_proc_map.end()) {

通过遍历socket_list,判断在第一步保存在inode_proc_map中的inode信息与info中的inode信息是否一致,如果一致,说明就是我们需要的那个进程的网络连接的信息。最终保存我们查询到的信息results.push_back(std::move(r));
到这里,我们就查询到了进程的所有的网络连接的信息。最终通过osquery展现。

osquery> select * from process_open_sockets where pid=14960; 
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+
| pid   | fd | socket | family | protocol | local_address | remote_address | local_port | remote_port | path | state       | net_namespace |
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+
| 14960 | 2  | 307410 | 2      | 6        | 192.168.2.156 | 192.168.2.145  | 51118      | 8888        |      | ESTABLISHED | 4026531956    |
+-------+----+--------+--------+----------+---------------+----------------+------------+-------------+------+-------------+---------------+

以上就是整个osquery执行process_open_sockets表查询的整个流程。

扩展

Linux一些皆文件的特性,使得我们能够通过读取Linux下某些文件信息获取系统/进程所有的信息。在前面我们仅仅是从osquery的角度来分析的。本节主要是对Linux中的与网络有关、进程相关的信息进行说明。

/proc/net/tcp/proc/net/udp中保存了当前系统中所有的进程信息,与/proc/pid/net/tcp或者是/proc/pid/net/udp中保存的信息完全相同。

/proc/net/tcp信息如下:

sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
0: 00000000:1538 00000000:0000 0A 00000000:00000000 00:00000000 00000000    26        0 26488 1 ffff912c69c21740 100 0 0 10 0
1: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 28721 1 ffff912c69c23640 100 0 0 10 0
2: 00000000:01BB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27739 1 ffff912c69c21f00 100 0 0 10 0
3: 00000000:1F40 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 471681 1 ffff912c37488f80 100 0 0 10 0
4: 0100007F:18EB 00000000:0000 0A 00000000:00000000 00:00000000 00000000   988        0 25611 1 ffff912c69c207c0 100 0 0 10 0
5: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27737 1 ffff912c69c226c0 100 0 0 10 0
6: 017AA8C0:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 29031 1 ffff912c69c23e00 100 0 0 10 0
7: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25754 1 ffff912c69c20f80 100 0 0 10 0
8: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25590 1 ffff912c69c20000 100 0 0 10 0
9: 9C02A8C0:C7AE 9102A8C0:22B8 01 00000000:00000000 00:00000000 00000000  1000        0 307410 1 ffff912c374887c0 20 0 0 10 -1

/proc/14960/net/tcp信息如下:

sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
0: 00000000:1538 00000000:0000 0A 00000000:00000000 00:00000000 00000000    26        0 26488 1 ffff912c69c21740 100 0 0 10 0
1: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 28721 1 ffff912c69c23640 100 0 0 10 0
2: 00000000:01BB 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27739 1 ffff912c69c21f00 100 0 0 10 0
3: 0100007F:18EB 00000000:0000 0A 00000000:00000000 00:00000000 00000000   988        0 25611 1 ffff912c69c207c0 100 0 0 10 0
4: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 27737 1 ffff912c69c226c0 100 0 0 10 0
5: 017AA8C0:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 29031 1 ffff912c69c23e00 100 0 0 10 0
6: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25754 1 ffff912c69c20f80 100 0 0 10 0
7: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 25590 1 ffff912c69c20000 100 0 0 10 0
8: 9C02A8C0:C7AE 9102A8C0:22B8 01 00000000:00000000 00:00000000 00000000  1000        0 307410 1 ffff912c374887c0 20 0 0 10 -1

我们每一列的含义都是固定的,我们以最终一列9C02A8C0:C7AE 9102A8C0:22B8 01 00000000:00000000 00:00000000 00000000 1000 0 307410 1 ffff912c374887c0 20 0 0 10 -1为例进行说明。

1.local_address,本地通讯端口和IP,本例是9C02A8C0:C7AE9C02A8C0,是本地IP。9C02A8C0是十六进制,转换为十进制是2617419968,将其转换为IP地址则是156.2.168.192,倒装一下得到192.168.2.156C7AE转化为十进制是51118。所以当进行网络通信时,得到本地IP是192.168.2.156,端口是51118

2.rem_address,远程服务器通信端口和IP,本例是9102A8C0:22B89102A8C0是远程IP。分析方法和local_address相同,得到远程IP是192.168.2.145,端口是8888

3.st,socket的状态,本例是01st的不同的值表示不同的含义。

  • 01: ESTABLISHED,
  • 02: SYN_SENT
  • 03: SYN_RECV
  • 04: FIN_WAIT1
  • 05: FIN_WAIT2
  • 06: TIME_WAIT
  • 07: CLOSE
  • 08: CLOSE_WAIT
  • 09: LAST_ACK
  • 0A: LISTEN
  • 0B: CLOSING

所以在本例中01则说明是ESTABLISHED状态。

4.tx_queue, 表示发送队列中的数据长度,本例是00000000

5.rx_queue, 如果状态是ESTABLISHED,表示接受队列中数据长度;如果是LISTEN,表示已完成连接队列的长度;

6.tr,定时器类型。为0,表示没有启动计时器;为1,表示重传定时器;为2,表示连接定时器;为3,表示TIME_WAIT定时器;为4,表示持续定时器;

7.tm->when,超时时间。

8.retrnsmt,超时重传次数

9.uid,用户id

10.timeout,持续定时器或保洁定时器周期性发送出去但未被确认的TCP段数目,在收到ACK之后清零

11.inode,socket连接对应的inode

12.1,没有显示header,表示的是socket的引用数目

13.ffff912c374887c0,没有显示header,表示sock结构对应的地址

14.20,没有显示header,表示RTO,单位是clock_t

15.0,用来计算延时确认的估值

16.0,快速确认数和是否启用标志位的或元算结果

17.10,当前拥塞窗口大小

18.-1,如果慢启动阈值大于等于0x7fffffff显示-1,否则表示慢启动阈值

proc_net_tcp_decode这篇文章对每个字段也进行了详细地说明。

通过查看某个具体的pidfd信息,检查是否存在以socket:开头的文件描述符,如果存在则说明存在网络通信。

3.jpg

在得到了socket所对应的inode之后,就可以在/proc/net/tcp中查询对应的socket的信息,比如远程服务器的IP和端口信息。这样通过socketinode就可以关联进程信息和它的网络信息。

总结

论读源代码的重要性

以上

osquery源码解读之分析shell_history

说明

前面两篇主要是对osquery的使用进行了说明,本篇文章将会分析osquery的源码。本文将主要对shell_historyprocess_open_sockets两张表进行说明。通过对这些表的实现分析,一方面能够了解osquery的实现通过SQL查询系统信息的机制,另一方面可以加深对Linux系统的理解。

表的说明

shell_history是用于查看shell的历史记录,而process_open_sockets是用于记录主机当前的网络行为。示例用法如下:

shell_history

osquery> select * from shell_history limit 3;
+------+------+-------------------------------------------------------------------+-----------------------------+
| uid  | time | command                                                           | history_file                |
+------+------+-------------------------------------------------------------------+-----------------------------+
| 1000 | 0    | pwd                                                               | /home/username/.bash_history |
| 1000 | 0    | ps -ef                                                            | /home/username/.bash_history |
| 1000 | 0    | ps -ef | grep java                                                | /home/username/.bash_history |
+------+------+-------------------------------------------------------------------+-----------------------------+

process_open_socket显示了一个反弹shell的链接。

osquery> select * from process_open_sockets order by pid desc limit 1;
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+
| pid    | fd | socket   | family | protocol | local_address | remote_address | local_port | remote_port | path | state      | net_namespace |
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+
| 115567 | 3  | 16467630 | 2      | 6        | 192.168.2.142 | 192.168.2.143  | 46368      | 8888        |      | ESTABLISH  | 0             |
+--------+----+----------+--------+----------+---------------+----------------+------------+-------------+------+------------+---------------+

osquery整体的代码结构十分地清晰。所有表的定义都是位于specs下面,所有表的实现都是位于osquery/tables

我们以shell_history为例,其表的定义是在specs/posix/shell_history.table

table_name("shell_history")
description("A line-delimited (command) table of per-user .*_history data.")
schema([
    Column("uid", BIGINT, "Shell history owner", additional=True),
    Column("time", INTEGER, "Entry timestamp. It could be absent, default value is 0."),
    Column("command", TEXT, "Unparsed date/line/command history line"),
    Column("history_file", TEXT, "Path to the .*_history for this user"),
    ForeignKey(column="uid", table="users"),
])
attributes(user_data=True, no_pkey=True)
implementation("shell_history@genShellHistory")
examples([
    "select * from users join shell_history using (uid)",
])
fuzz_paths([
    "/home",
    "/Users",
])s

shell_history.table中已经定义了相关的信息,入口是shell_history.cpp中的genShellHistory()函数,甚至给出了示例的SQL语句select * from users join shell_history using (uid)shell_history.cpp是位于osquery/tables/system/posix/shell_history.cpp中。

同理,process_open_sockets的表定义位于specs/process_open_sockets.table,实现位于osquery/tables/networking/[linux|freebsd|windows]/process_open_sockets.cpp。可以看到由于process_open_sockets在多个平台上面都有,所以在linux/freebsd/windows中都存在process_open_sockets.cpp的实现。本文主要是以linux为例。

shell_history实现

前提知识

在分析之前,介绍一下Linux中的一些基本概念。我们常常会看到各种不同的unix shell,如bash、zsh、tcsh、sh等等。bash是我们目前最常见的,它几乎是所有的类unix操作中内置的一个shell。而zsh相对于bash增加了更多的功能。我们在终端输入各种命令时,其实都是使用的这些shell。

我们在用户的根目录下方利用ls -all就可以发现存在.bash_history文件,此文件就记录了我们在终端中输入的所有的命令。同样地,如果我们使用zsh,则会存在一个.zsh_history记录我们的命令。

同时在用户的根目录下还存在.bash_sessions的目录,根据这篇文章的介绍:

A new folder (~/.bash_sessions/) is used to store HISTFILE’s and .session files that are unique to sessions. If $BASH_SESSION or $TERM_SESSION_ID is set upon launching the shell (i.e. if Terminal is resuming from a saved state), the associated HISTFILE is merged into the current one, and the .session file is ran. Session saving is facilitated by means of an EXIT trap being set for a function bash_update_session_state.

.bash_sessions中存储了特定SESSION的HISTFILE和.session文件。如果在启动shell时设置了$BASH_SESSION$TERM_SESSION_ID。当此特定的SESSION启动了之后就会利用$BASH_SESSION$TERM_SESSION_ID恢复之前的状态。这也说明在.bash_sessions目录下也会存在*.history用于记录特定SESSION的历史命令信息。

分析

QueryData genShellHistory(QueryContext& context) {
    QueryData results;
    // Iterate over each user
    QueryData users = usersFromContext(context);
    for (const auto& row : users) {
        auto uid = row.find("uid");
        auto gid = row.find("gid");
        auto dir = row.find("directory");
        if (uid != row.end() && gid != row.end() && dir != row.end()) {
            genShellHistoryForUser(uid->second, gid->second, dir->second, results);
            genShellHistoryFromBashSessions(uid->second, dir->second, results);
        }
    }

    return results;
}

分析shell_history.cpp的入口函数genShellHistory():

遍历所有的用户,拿到uidgiddirectory。之后调用genShellHistoryForUser()获取用户的shell记录genShellHistoryFromBashSessions()genShellHistoryForUser()作用类似。

genShellHistoryForUser():

void genShellHistoryForUser(const std::string& uid, const std::string& gid, const std::string& directory, QueryData& results) {
    auto dropper = DropPrivileges::get();
    if (!dropper->dropTo(uid, gid)) {
        VLOG(1) << "Cannot drop privileges to UID " << uid;
        return;
    }

    for (const auto& hfile : kShellHistoryFiles) {
        boost::filesystem::path history_file = directory;
        history_file /= hfile;
        genShellHistoryFromFile(uid, history_file, results);
    }
}

可以看到在执行之前调用了:

auto dropper = DropPrivileges::get();
if (!dropper->dropTo(uid, gid)) {
    VLOG(1) << "Cannot drop privileges to UID " << uid;
    return;
}

用于对giduid降权,为什么要这么做呢?后来询问外国网友,给了一个很详尽的答案:

Think about a scenario where you are a malicious user and you spotted a vulnerability(buffer overflow) which none of us has. In the code (osquery which is running usually with root permission) you also know that history files(controlled by you) are being read by code(osquery). Now you stored a shell code (a code which is capable of destroying anything in the system)such a way that it would overwrite the saved rip. So once the function returns program control is with the injected code(shell code) with root privilege. With dropping privilege you reduce the chance of putting entire system into danger.

There are other mitigation techniques (e.g. stack guard) to avoid above scenario but multiple defenses are required

简而言之,osquery一般都是使用root权限运行的,如果攻击者在.bash_history中注入了一段恶意的shellcode代码。那么当osquery读到了这个文件之后,攻击者就能够获取到root权限了,所以通过降权的方式就能够很好地避免这样的问题。

/**
* @brief The privilege/permissions dropper deconstructor will restore
* effective permissions.
*
* There should only be a single drop of privilege/permission active.
*/
virtual ~DropPrivileges();

可以看到当函数被析构之后,就会重新恢复对应文件的权限。

之后遍历kShellHistoryFiles文件,执行genShellHistoryFromFile()代码。kShellHistoryFiles在之前已经定义,内容是:

const std::vector<std::string> kShellHistoryFiles = {
    ".bash_history", ".zsh_history", ".zhistory", ".history", ".sh_history",
};

可以发现其实在kShellHistoryFiles定义的就是常见的bash用于记录shell history目录的文件。最后调用genShellHistoryFromFile()读取.history文件,解析数据。

void genShellHistoryFromFile(const std::string& uid, const boost::filesystem::path& history_file, QueryData& results) {
    std::string history_content;
    if (forensicReadFile(history_file, history_content).ok()) {
        auto bash_timestamp_rx = xp::sregex::compile("^#(?P<timestamp>[0-9]+)$");
        auto zsh_timestamp_rx = xp::sregex::compile("^: {0,10}(?P<timestamp>[0-9]{1,11}):[0-9]+;(?P<command>.*)$");
        std::string prev_bash_timestamp;
        for (const auto& line : split(history_content, "\n")) {
            xp::smatch bash_timestamp_matches;
            xp::smatch zsh_timestamp_matches;

            if (prev_bash_timestamp.empty() &&
                xp::regex_search(line, bash_timestamp_matches, bash_timestamp_rx)) {
                prev_bash_timestamp = bash_timestamp_matches["timestamp"];
                continue;
            }

            Row r;

            if (!prev_bash_timestamp.empty()) {
                r["time"] = INTEGER(prev_bash_timestamp);
                r["command"] = line;
                prev_bash_timestamp.clear();
            } else if (xp::regex_search(
                    line, zsh_timestamp_matches, zsh_timestamp_rx)) {
                std::string timestamp = zsh_timestamp_matches["timestamp"];
                r["time"] = INTEGER(timestamp);
                r["command"] = zsh_timestamp_matches["command"];
            } else {
                r["time"] = INTEGER(0);
                r["command"] = line;
            }

            r["uid"] = uid;
            r["history_file"] = history_file.string();
            results.push_back(r);
        }
    }
}

整个代码逻辑非常地清晰。

  1. forensicReadFile(history_file, history_content)读取文件内容。
  2. 定义bash_timestamp_rxzsh_timestamp_rx的正则表达式,用于解析对应的.history文件的内容。 for (const auto& line : split(history_content, "\n"))读取文件的每一行,分别利用bash_timestamp_rxzsh_timestamp_rx解析每一行的内容。
  3. Row r;...;r["history_file"] = history_file.string();results.push_back(r);将解析之后的内容写入到Row中返回。

自此就完成了shell_history的解析工作。执行select * from shell_history就会按照上述的流程返回所有的历史命令的结果。

对于genShellHistoryFromBashSessions()函数:

void genShellHistoryFromBashSessions(const std::string &uid,const std::string &directory,QueryData &results) {
    boost::filesystem::path bash_sessions = directory;
    bash_sessions /= ".bash_sessions";

    if (pathExists(bash_sessions)) {
        bash_sessions /= "*.history";
        std::vector <std::string> session_hist_files;
        resolveFilePattern(bash_sessions, session_hist_files);

        for (const auto &hfile : session_hist_files) {
            boost::filesystem::path history_file = hfile;
            genShellHistoryFromFile(uid, history_file, results);
        }
    }
}

genShellHistoryFromBashSessions()获取历史命令的方法比较简单。

  1. 获取到.bash_sessions/*.history所有的文件;
  2. 同样调用genShellHistoryFromFile(uid, history_file, results);方法获取到历史命令;

总结

阅读一些优秀的开源软件的代码,不仅能够学习到相关的知识更能够了解到一些设计哲学。拥有快速学习能⼒的⽩帽子,是不能有短板的。有的只是⼤量的标准板和⼏块长板。

osquery初识

0x01 说明

osquery是一个由FaceBook开源用于对系统进行查询、监控以及分析的一款软件。osquery对其的说明如下:

osquery exposes an operating system as a high-performance relational database. This allows you to write SQL-based queries to explore operating system data. With osquery, SQL tables represent abstract concepts such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.

我们知道当你们在Linux中使用诸如pstopls -l等等命令的时候,可以发下其实他们的输出结果的格式都是很固定的很像一张表。或许是基于这样的想法,facebook开发了osquery。osquery将操作系统当作是一个高性能的关系型数据库。使用osquery运行我们能够使用类似于SQL语句的方式去查询数据库中的信息,比如正在运行的进程信息,加载的内核模块,网络连接,浏览器插件等等信息(一切查询的信息的粒度取决于osquery的实现粒度了)。

osquery也广泛地支持多个平台,包括MacOS、CentOS、Ubuntu、Windows 10以及FreeBSD,具体所支持的版本的信息也可以在osquery主页查看。除此之外,osquery的配套文档/网站也是一应俱全,包括主页Githubreadthedocsslack

本篇文章以CentOS为例说明Osquery的安装以及使用。

0x02 安装

主页上面提供了不同操作系统的安装包,我们下载CentOS对应的rpm文件即可。在本例中文件名是osquery-3.3.0-1.linux.x86_64.rpm,使用命令sudo yum install osquery-3.3.0-1.linux.x86_64.rpm安装。安装成功之后会出现:

Installed:
  osquery.x86_64 0:3.3.0-1.linux                                                                                                                                                             
Complete!

0x03 运行

osquery存在两种运行模式,分别是osqueryi(交互式模式)、osqueryd(后台进程模式)。

  • osqueryi,与osqueryd安全独立,不需要以管理员的身份运行,能够及时地查看当前操作系统的状态信息。
  • osqueryd,我们能够利用osqueryd执行定时查询记录操作系统的变化,例如在第一次执行和第二次执行之间的进程变化(增加/减少),osqueryd会将进程执行的结果保存(文件或者是直接打到kafka中)。osqueryd还会利用操作系统的API来记录文件目录的变化、硬件事件、网络行为的变化等等。osqueryd在Linux中是以系统服务的方式来运行。

为了便于演示,我们使用osqueyi来展示osquery强大的功能。我们直接在terminal中输入osqueryi即可进入到osqueryi的交互模式中(osqueryi采用的是sqlite的shell的语法,所以我们也可以使用在sqlite中的所有的内置函数)。

[user@localhost Desktop]$ osqueryi
Using a virtual database. Need help, type '.help'
osquery> .help
Welcome to the osquery shell. Please explore your OS!
You are connected to a transient 'in-memory' virtual database.

.all [TABLE]     Select all from a table
.bail ON|OFF     Stop after hitting an error
.echo ON|OFF     Turn command echo on or off
.exit            Exit this program
.features        List osquery's features and their statuses
.headers ON|OFF  Turn display of headers on or off
.help            Show this message
.mode MODE       Set output mode where MODE is one of:
                   csv      Comma-separated values
                   column   Left-aligned columns see .width
                   line     One value per line
                   list     Values delimited by .separator string
                   pretty   Pretty printed SQL results (default)
.nullvalue STR   Use STRING in place of NULL values
.print STR...    Print literal STRING
.quit            Exit this program
.schema [TABLE]  Show the CREATE statements
.separator STR   Change separator used by output mode
.socket          Show the osquery extensions socket path
.show            Show the current values for various settings
.summary         Alias for the show meta command
.tables [TABLE]  List names of tables
.width [NUM1]+   Set column widths for "column" mode
.timer ON|OFF      Turn the CPU timer measurement on or off

通过.help,我们能够查看在osqueryi模式下的一些基本操作。比如.exit表示退出osqueryi,.mode切换osqueryi的输出结果,.show展示目前osqueryi的配置信息,.tables展示在当前的操作系统中能够支持的所有的表名。.schema [TABLE]显示具体的表的结构信息。

osquery> .show
osquery - being built, with love, at Facebook

osquery 3.3.0
using SQLite 3.19.3

General settings:
     Flagfile: 
       Config: filesystem (/etc/osquery/osquery.conf)
       Logger: filesystem (/var/log/osquery/)
  Distributed: tls
     Database: ephemeral
   Extensions: core
       Socket: /home/xingjun/.osquery/shell.em

Shell settings:
         echo: off
      headers: on
         mode: pretty
    nullvalue: ""
       output: stdout
    separator: "|"
        width: 

Non-default flags/options:
  database_path: /home/xingjun/.osquery/shell.db
  disable_database: true
  disable_events: true
  disable_logging: true
  disable_watchdog: true
  extensions_socket: /home/xingjun/.osquery/shell.em
  hash_delay: 0
  logtostderr: true
  stderrthreshold: 3

可以看到设置包括常规设置(General settings)、shell设置(Shell settings)、非默认选项(Non-default flags/options)。在常规设置中主要是显示了各种配置文件的位置(配置文件/存储日志文件的路径)。 在shell设置中包括了是否需要表头信息(headers),显示方式(mode: pretty),分隔符(separator: "|")。

.table可以查看在当前操作系统中所支持的所有的表,虽然在schema中列出了所有的表(包括了win平台,MacOS平台,Linux平台)。但是具体到某一个平台上面是不会包含其他平台上的表。下方显示的就是我在CentOS7下显示的表。

osquery> .table
  => acpi_tables
  => apt_sources
  => arp_cache
  => augeas
  => authorized_keys
  => block_devices
  => carbon_black_info
  => carves
  => chrome_extensions
  => cpu_time
  => cpuid
  => crontab
...

.schema [TABLE]可以用于查看具体的表的结构信息。如下所示:

osquery> .schema users
CREATE TABLE users(`uid` BIGINT, `gid` BIGINT, `uid_signed` BIGINT, `gid_signed` BIGINT, `username` TEXT, `description` TEXT, `directory` TEXT, `shell` TEXT, `uuid` TEXT, `type` TEXT HIDDEN, PRIMARY KEY (`uid`, `username`)) WITHOUT ROWID;
osquery> .schema processes
CREATE TABLE processes(`pid` BIGINT, `name` TEXT, `path` TEXT, `cmdline` TEXT, `state` TEXT, `cwd` TEXT, `root` TEXT, `uid` BIGINT, `gid` BIGINT, `euid` BIGINT, `egid` BIGINT, `suid` BIGINT, `sgid` BIGINT, `on_disk` INTEGER, `wired_size` BIGINT, `resident_size` BIGINT, `total_size` BIGINT, `user_time` BIGINT, `system_time` BIGINT, `disk_bytes_read` BIGINT, `disk_bytes_written` BIGINT, `start_time` BIGINT, `parent` BIGINT, `pgroup` BIGINT, `threads` INTEGER, `nice` INTEGER, `is_elevated_token` INTEGER HIDDEN, `upid` BIGINT HIDDEN, `uppid` BIGINT HIDDEN, `cpu_type` INTEGER HIDDEN, `cpu_subtype` INTEGER HIDDEN, `phys_footprint` BIGINT HIDDEN, PRIMARY KEY (`pid`)) WITHOUT ROWID;

上面通过.schema查看usersprocesses表的信息,结果输出的是他们对应的DDL。

0x03 基本使用

在本章节中,将会演示使用osqueryi来实时查询操作系统中的信息(为了方便展示查询结果使用的是.mode line模式)。

查看系统信息

osquery> select * from system_info;
          hostname = localhost
              uuid = 4ee0ad05-c2b2-47ce-aea1-c307e421fa88
          cpu_type = x86_64
       cpu_subtype = 158
         cpu_brand = Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
cpu_physical_cores = 1
 cpu_logical_cores = 1
     cpu_microcode = 0x84
   physical_memory = 2924228608
   hardware_vendor = 
    hardware_model = 
  hardware_version = 
   hardware_serial = 
     computer_name = localhost.localdomain
    local_hostname = localhost

查询的结果包括了CPU的型号,核数,内存大小,计算机名称等等;

查看OS版本

osquery> select * from os_version;
         name = CentOS Linux
      version = CentOS Linux release 7.4.1708 (Core)
        major = 7
        minor = 4
        patch = 1708
        build = 
     platform = rhel
platform_like = rhel
     codename =

以看到我的本机的操作系统的版本是CentOS Linux release 7.4.1708 (Core)

查看内核信息版本

osquery> SELECT * FROM kernel_info;
  version = 3.10.0-693.el7.x86_64
arguments = ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8
     path = /vmlinuz-3.10.0-693.el7.x86_64
   device = /dev/mapper/centos-root

osquery> SELECT * FROM kernel_modules LIMIT 3;
   name = tcp_lp
   size = 12663
used_by = -
 status = Live
address = 0xffffffffc06cf000

   name = fuse
   size = 91874
used_by = -
 status = Live
address = 0xffffffffc06ae000

   name = xt_CHECKSUM
   size = 12549
used_by = -
 status = Live
address = 0xffffffffc06a9000

查询repo和pkg信息

osquery提供查询系统中的repo和okg相关信息的表。在Ubuntu中对应的是apt相关的包信息,在Centos中对应的是yum相关的包信息。本例均以yum包为例进行说明

osquery> SELECT * FROM yum_sources  limit 2;
    name = CentOS-$releasever - Base
 baseurl = 
 enabled = 
gpgcheck = 1
  gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

    name = CentOS-$releasever - Updates
 baseurl = 
 enabled = 
gpgcheck = 1
  gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

我们可以直接利用yum_sources来查看操作系统的yum源相关的信息。

osquery> SELECT name, version FROM rpm_packages order by name limit 3;
   name = GConf2
version = 3.2.6

   name = GeoIP
version = 1.5.0

   name = ModemManager
version = 1.6.0

利用rpm_packages查看系统中已经安装的rpm包信息。我们也可以通过name对我们需要查询的包进行过滤,如下:

osquery> SELECT name, version FROM rpm_packages where name="osquery";
   name = osquery
version = 3.3.0

挂载信息

我们可以使用mounts表来查询系统中的具体的驱动信息。例如我们可以如下的SQL语句进行查询:

SELECT * FROM mounts;
SELECT device, path, type, inodes_free, flags FROM mounts;

我们也可以使用where语句查询摸一个具体的驱动信息,例如ext4或者是tmpfs信息。如下:

osquery> SELECT device, path, type, inodes_free, flags FROM mounts WHERE type="ext4";
osquery> SELECT device, path, type, inodes_free, flags FROM mounts WHERE type="tmpfs";
     device = tmpfs
       path = /dev/shm
       type = tmpfs
inodes_free = 356960
      flags = rw,seclabel,nosuid,nodev

     device = tmpfs
       path = /run
       type = tmpfs
inodes_free = 356386
      flags = rw,seclabel,nosuid,nodev,mode=755

     device = tmpfs
       path = /sys/fs/cgroup
       type = tmpfs
inodes_free = 356945
      flags = ro,seclabel,nosuid,nodev,noexec,mode=755

     device = tmpfs
       path = /run/user/42
       type = tmpfs
inodes_free = 356955
      flags = rw,seclabel,nosuid,nodev,relatime,size=285572k,mode=700,uid=42,gid=42

     device = tmpfs
       path = /run/user/1000
       type = tmpfs
inodes_free = 356939
      flags = rw,seclabel,nosuid,nodev,relatime,size=285572k,mode=700,uid=1000,gid=1000

内存信息

使用memory_info查看内存信息,如下:

osquery> select * from memory_info;
memory_total = 2924228608
 memory_free = 996024320
     buffers = 4280320
      cached = 899137536
 swap_cached = 0
      active = 985657344
    inactive = 629919744
  swap_total = 2684350464
   swap_free = 2684350464

网卡信息

使用interface_addresses查看网卡信息,如下:

osquery> SELECT * FROM interface_addresses;
     interface = lo
       address = 127.0.0.1
          mask = 255.0.0.0
     broadcast = 
point_to_point = 127.0.0.1
          type = 

     interface = virbr0
       address = 192.168.122.1
          mask = 255.255.255.0
     broadcast = 192.168.122.255
point_to_point = 
          type = 

     interface = lo
       address = ::1
          mask = ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
     broadcast = 
point_to_point = 
          type =

还可以使用interface_details查看更加具体的网卡信息。

SELECT * FROM interface_details;
SELECT interface, mac, ipackets, opackets, ibytes, obytes FROM interface_details;

查询结果如下

osquery> SELECT * FROM interface_details;
  interface = lo
        mac = 00:00:00:00:00:00
       type = 4
        mtu = 65536
     metric = 0
      flags = 65609
   ipackets = 688
   opackets = 688
     ibytes = 59792
     obytes = 59792
    ierrors = 0
    oerrors = 0
     idrops = 0
     odrops = 0
 collisions = 0
last_change = -1
 link_speed = 
   pci_slot = 
    ....

系统启动时间

osquery> select * from uptime;
         days = 0
        hours = 2
      minutes = 23
      seconds = 51
total_seconds = 8631

查询用户信息

osquery提供了多个表用于查询用户的信息,包括使用users表检索系统中所有的用户,使用last表查看用户上次登录的信息,使用logged_in_user查询具有活动shell的用户信息。

使用select * from users查看所有用户信息,使用类似于uid>1000的方式过滤用户。

osquery> select * from users where uid>1000;
        uid = 65534
        gid = 65534
 uid_signed = 65534
 gid_signed = 65534
   username = nfsnobody
description = Anonymous NFS User
  directory = /var/lib/nfs
      shell = /sbin/nologin
       uuid =

我们可以使用last表查询最终的登录信息,如SELECT * FROM last;。对于普通用户来说,其type值为7。那么我们的查询条件如下:

osquery> SELECT * FROM last where type=7;
username = user
     tty = :0
     pid = 12776
    type = 7
    time = 1539882439
    host = :0

username = user
     tty = pts/0
     pid = 13754
    type = 7
    time = 1539882466
    host = :0

其中的time是时间戳类型,转换为具体的日期之后就可以看到具体的登录时间了。

使用SELECT * FROM logged_in_users;查看当前已经登录的用户信息。

防火墙信息

我们可以使用iptables来查看具体的防火墙信息,如select * from iptables;,也可以进行过滤查询具体的防火墙信息。如SELECT chain, policy, src_ip, dst_ip FROM iptables WHERE chain="POSTROUTING" order by src_ip;

进程信息

我们可以使用processes来查询系统上进程的信息,包括pid,name,path,command等等。
可以使用select * from processes;或者查看具体的某几项信息,select pid,name,path,cmdline from processes;

osquery> select pid,name,path,cmdline from processes limit 2;
    pid = 1
   name = systemd
   path = 
cmdline = /usr/lib/systemd/systemd --switched-root --system --deserialize 21

    pid = 10
   name = watchdog/0
   path = 
cmdline =

检查计划任务

我们可以使用crontab来检查系统中的计划任务。

osquery> select * from crontab;
       event = 
      minute = 01
        hour = *
day_of_month = *
       month = *
 day_of_week = *
     command = root run-parts /etc/cron.hourly
        path = /etc/cron.d/0hourly

       event = 
      minute = 0
        hour = 1
day_of_month = *
       month = *
 day_of_week = Sun
     command = root /usr/sbin/raid-check
        path = /etc/cron.d/raid-check

其他

在Linux中还存在其他很多的表能够帮助我们更好地进行入侵检测相关的工作,包括process_eventssocket_eventsprocess_open_sockets等等,这些表可供我们进行入侵检测的确认工作。至于这些表的工作原理,有待阅读osquery的源代码进行进一步分析。

0x04 总结

本文主要是对Osquery的基础功能进行了介绍。Oquery的强大功能需要进一步地挖掘和发现。总体来说,Osquery将操作系统中的信息抽象成为一张张表,对于进行基线检查,系统监控是一个非常优雅的方式。当然由于Osquery在这方面的优势,也可以考虑将其作为HIDS的客户端,但是如果HIDS仅仅只有Osquery也显然是不够的。

以上