{ "cells": [ { "cell_type": "markdown", "id": "01a4a79a", "metadata": {}, "source": [ "# Graph Convolutional Networks (GCN)" ] }, { "cell_type": "code", "execution_count": 1, "id": "b6352d74-08f4-489b-bafb-03116d52d26f", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/gihan/.env/p11/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "import torch.nn as nn\n", "import torch\n", "from torch_geometric.utils import dense_to_sparse\n", "from torch.nn import Linear, Parameter\n", "from torch_geometric.nn import MessagePassing\n", "from torch_geometric.utils import add_self_loops, degree" ] }, { "cell_type": "markdown", "id": "1e8dbb55-4380-4a6a-9c08-e76eeada4f8a", "metadata": {}, "source": [ "Let's build a graph convolutional neural network. The equation to update node features in GCNs can be defined as below." ] }, { "cell_type": "markdown", "id": "815a6ad3-c564-4537-a521-45ded2ad7f06", "metadata": {}, "source": [ "$x_{i}^{(k)} = \\sum_{j \\in N(i) \\bigcup i} \\frac{1}{\\sqrt(deg(i))\\cdot\\sqrt(deg(j)) } \\cdot (W^{T} \\cdot x_{j}^{(k-1)})+b$ " ] }, { "cell_type": "markdown", "id": "8334485b-d949-49ae-9ec2-72b13aebdbf9", "metadata": {}, "source": [ "$x$ are node features.
\n", "$N(i)$ are the neighbors of node $i$
\n", "$deg(i)$ and $deg(j)$ are the degrees of nodes $i$ and $j$. The degree refers to the number of edges attached to a given node.
\n", "$W$ is a matrix that transforms the node features. The elements of this matrix are trainable.
\n", "$b$ is a bias value." ] }, { "cell_type": "markdown", "id": "279b7cf6-387a-4d33-80c6-78fe02aace60", "metadata": {}, "source": [ "Here, k is used to denote the graph convolution layer. This can be identified as a single pass through the graph convolution operation. In other words, a single iteration of using the above equation on the current set of node (and/or edge) features." ] }, { "cell_type": "markdown", "id": "15a8a1e2-94b2-4ccd-be55-464b246ab2ab", "metadata": {}, "source": [ "Let's understand what happens here. In this equation, we update the the node features of node $i$. The subscript of the summation symbol indicates that we consider the neighbors of node $i$ and node $i$ itself. For a given neighbor $j$, we transform it's node feature $x_{j}$ using the weight matrix $W$. We divide this value by the product of the square root of the degrees of nodes $i$ and $j$.\n", "\n", "We do this operation for all the neighboring nodes of $i$." ] }, { "cell_type": "markdown", "id": "043100bc-338e-4938-b0ef-e19f6769c995", "metadata": {}, "source": [ "Then we collect all these transformed node features of the neighbhring nodes corresponding to node $i$ and add them all together.\n", "Optionally, we can add a bias term $b$ to this summation. This summation becomes the new node feature of node $i$, $x_{i}^{k}$ at iteration $k$." ] }, { "cell_type": "markdown", "id": "bcba0f05-3bc7-4738-9716-a1603df4ab23", "metadata": {}, "source": [ "![title](assets/gcn1.png)" ] }, { "cell_type": "markdown", "id": "cedabe6e-f9a0-4e68-ba45-7103b82ea85a", "metadata": {}, "source": [ "\n", "Now let's build graph convolution nework using torch_geometric. This implementation is based on the code segment provided in their documentation. I did some modifications to make it easy to compare with another implementaion. Specifically I removed `self.lin.reset_parameters()` and added my own weight matrx $W$,\n", "and set the bias values to zero." ] }, { "cell_type": "code", "execution_count": 2, "id": "82e622d8-17e1-4996-9e7b-6c6a6a19a665", "metadata": {}, "outputs": [], "source": [ "class GCNConv(MessagePassing):\n", " def __init__(self, in_channels, out_channels, weight_data):\n", " super().__init__(aggr='add') \n", " self.lin = Linear(in_channels, out_channels, bias=True)\n", "\n", "\n", " self.lin.weight.data = weight_data\n", " self.lin.bias.data = torch.Tensor([0., 0.])\n", "\n", " self.b = Parameter(torch.empty(out_channels))\n", " self.b.data.zero_()\n", "\n", "\n", " def forward(self, x, edge_index):\n", "\n", " x = self.lin(x) # line 1\n", "\n", " source, target = edge_index # line 2\n", " deg = degree(target, x.size(1), dtype=x.dtype) # line 3\n", "\n", " deg_inv_sqrt = deg**(-0.5) # line 4\n", "\n", " # if the inverse square root is infinity, that value is set to zero.\n", " deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0 # line 5\n", " norm = deg_inv_sqrt[source] * deg_inv_sqrt[target] # line 6\n", " # print(norm)\n", "\n", " # propagate is a method implemented in MessagePassing. This is where \n", " # massage passing takes place. you have to send the edge indices and the node features.\n", " # additionally, you can send other optional values to create your message from node j to node i.\n", " out = self.propagate(edge_index, x=x, norm=norm) # line 7\n", " out = out + self.b # line 8\n", "\n", " return out\n", "\n", " def message(self, x_j, norm):\n", " # this is wehere we define the message from node j to node i.\n", " # print(x_j)\n", " return norm.view(-1, 1) * x_j" ] }, { "cell_type": "markdown", "id": "74fe7ca1-81db-4d43-93a1-1526fa47e2b4", "metadata": {}, "source": [ "### Let's understand what happens at each line of the forward method." ] }, { "cell_type": "markdown", "id": "875dcebb-cb0e-47f2-9449-153438311977", "metadata": {}, "source": [ "line 1: We transform the input features using the W matrix and a bias term. This corresponds to the $(W^{T} \\cdot x_{j}^{(k-1)})+b$ part in our equation for $x_{i}^{(k)}$
\n", "\n", "line 2: We seperate the two lists of connected nodes to two lists, $row$ and $col$." ] }, { "cell_type": "markdown", "id": "d56c51b5-19b0-471d-a0f7-53a0c0535faa", "metadata": {}, "source": [ "line 3: we calculate the degree of each node. We use pytoorch_geometrics, [degree function](https://pytorch-geometric.readthedocs.io/en/2.4.0/_modules/torch_geometric/utils/degree.html). The first argument to the degree function is the vector of edge indices. The second argument is the number of nodes in the graph." ] }, { "cell_type": "markdown", "id": "c71f3b51-3d27-48b0-b270-9bee54c5e4cf", "metadata": {}, "source": [ "line 4: Take the square root of the degree values." ] }, { "cell_type": "markdown", "id": "0e08391e-ee97-48cf-8bf1-eab57f47b3cd", "metadata": {}, "source": [ "line 5: if the inverse square root is infinity, that value is set to zero." ] }, { "cell_type": "markdown", "id": "da827ca8-5a49-46e6-8334-494a8ff6fb64", "metadata": {}, "source": [ "line 6: This is where we calculate the factor $\\frac{1}{\\sqrt(deg(i))\\cdot\\sqrt(deg(j)) }$." ] }, { "cell_type": "markdown", "id": "0b57e200-b1ed-41de-b94e-7fa6fb6af1b2", "metadata": {}, "source": [ "line 7: This is the confusing step in building GNNs using pytorch geometric. [propagate](https://pytorch-geometric.readthedocs.io/en/2.4.0/_modules/torch_geometric/nn/conv/message_passing.html#MessagePassing.propagate) method is where the MessagePassing happens. You have to send the edge_index and other additional values that are needed to contstruct the message. How to construct the message is defined in the `message` function. In our case, the message that is passed to the target node is constructed by multiplying the source node's features with the nomalization factor. Once the messages are constructed at each source node, they are aggregated. In our case, the aggreegation is the summing of all the neighbouring source node features. This aggregation operation also takes place inside the `propagate` method.\n", "\n", "Note that in our case, the the neighbouring nodes of a given target node consists of the target node itself. This is becasue we consider self loops. Self loops are the edges that connect a node to itself." ] }, { "cell_type": "markdown", "id": "d9404c04-0302-4668-a1fb-62bfc7141e5b", "metadata": {}, "source": [ "The `flow` parameter in the `MessagePassing` class determines the from where to where the messages are sent. By default, the value of `flow` is \"source_to_target\". If the edge index is a tensor of the shape, 2 $\\times$ 'number of messages' (as in our case). The messages are sent from nodes given in the first list of the `edge_index` tensor to the second index of the `edge_index` tensor. That is, the source nodes are in the first list of the edge_index tensor. The target nodes are in the second list of of the edge_index tensor." ] }, { "cell_type": "code", "execution_count": null, "id": "1f89fe17-c7cc-400c-b754-416654a67602", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "977c9bee-ade3-4f6e-96dd-ef83905a4e57", "metadata": {}, "source": [ "#### Define the node and edge features" ] }, { "cell_type": "markdown", "id": "e4cc9da2-f93a-4a5a-90a5-ed6ac6e6d747", "metadata": {}, "source": [ "These are our node features. Node 0 is featurized with $[0, 1]$. Node 1 is featurized with $[2, 3]$.\n", "Node 2 is featurized with $[4, 5]$. Node 3 is featurized with $[6, 7]$." ] }, { "cell_type": "code", "execution_count": 3, "id": "1c35d6ad-e7f3-42a0-9a42-0f93995811d0", "metadata": {}, "outputs": [], "source": [ "node_feats = torch.tensor([[[0., 1.],\n", " [2., 3.],\n", " [4., 5.],\n", " [6., 7.]]], dtype=torch.float32)" ] }, { "cell_type": "markdown", "id": "b9876565-77b0-42b6-8b20-6e504f27d385", "metadata": {}, "source": [ "The following matrix indicates which two nodes are connected. That is if node $i, j$ are connected with an edge the $i, j$ element of the following matrix is 1." ] }, { "cell_type": "code", "execution_count": 4, "id": "fba13436-1268-48d1-bf35-fb4bcf89a0f1", "metadata": {}, "outputs": [], "source": [ "adj_matrix = torch.Tensor([[[1, 1, 0, 0],\n", " [1, 1, 1, 1],\n", " [0, 1, 1, 1],\n", " [0, 1, 1, 1]]])" ] }, { "cell_type": "markdown", "id": "d167e229-d06e-45bb-a7d2-73afbc407447", "metadata": {}, "source": [ "`weight data` is the $W$ matrix that transforms our node features." ] }, { "cell_type": "code", "execution_count": 5, "id": "064abcc1-235b-470c-9b4a-2528f9f7ab1e", "metadata": {}, "outputs": [], "source": [ "weight_data = torch.Tensor([[1., 0.], [0., 1.]])" ] }, { "cell_type": "markdown", "id": "1f004129-513a-47d5-a5ac-b8b0aed257ad", "metadata": {}, "source": [ "pytorch_geometric requires that adjacency matrix expressed as a two lists. $i^{th}$ elements of the two lists are the node indices that are connected." ] }, { "cell_type": "code", "execution_count": 6, "id": "1b56ce07-7ce1-4ca9-8c3d-4f3eb27376f2", "metadata": {}, "outputs": [], "source": [ "edge_index, _ = dense_to_sparse(adj_matrix)" ] }, { "cell_type": "code", "execution_count": 7, "id": "094e4246-713c-4e60-ba1f-c4502b603c2d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],\n", " [0, 1, 0, 1, 2, 3, 1, 2, 3, 1, 2, 3]])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "edge_index" ] }, { "cell_type": "markdown", "id": "0ed8652c-7eec-4336-a32f-29c1f95d2cdb", "metadata": {}, "source": [ "Let's just see how the degree values are calcualted." ] }, { "cell_type": "code", "execution_count": 8, "id": "6b3ca564", "metadata": {}, "outputs": [], "source": [ "source, target = edge_index # line 2" ] }, { "cell_type": "code", "execution_count": 9, "id": "80c09b5a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([0, 1, 0, 1, 2, 3, 1, 2, 3, 1, 2, 3])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "target" ] }, { "cell_type": "code", "execution_count": 10, "id": "165a3cff", "metadata": {}, "outputs": [], "source": [ "deg = degree(target, 4) # line 3" ] }, { "cell_type": "code", "execution_count": 11, "id": "4cde147f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([2., 4., 3., 3.])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "deg" ] }, { "cell_type": "markdown", "id": "72d846a1-d2d4-4c97-8d54-c2f608a31976", "metadata": {}, "source": [ "Now, let's create the convolution layer." ] }, { "cell_type": "code", "execution_count": 12, "id": "c1aabf7c-58f1-46cd-ac46-f15132e61623", "metadata": {}, "outputs": [], "source": [ "conv = GCNConv(2, 2, weight_data)" ] }, { "cell_type": "code", "execution_count": 13, "id": "33b4216c-88f2-44e0-9091-663f94f968e2", "metadata": {}, "outputs": [], "source": [ "output_node_features = conv(node_feats, edge_index)" ] }, { "cell_type": "code", "execution_count": 14, "id": "1b3f9316-6031-4588-9df4-6eb8c462948b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[[0.7071, 1.5607],\n", " [3.3868, 4.5677],\n", " [3.9107, 4.8660],\n", " [3.9107, 4.8660]]], grad_fn=)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "output_node_features" ] }, { "cell_type": "code", "execution_count": null, "id": "b036e344-bbf2-4e72-a0d9-78520861d4f6", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "daceb98b-1cb1-49ff-80ec-7c8e53561239", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "6c59f6ba-860d-4dbf-891a-1f9eb4d05210", "metadata": {}, "source": [ "---\n", "\n", "[![Ask questions](https://img.shields.io/static/v1.svg?logo=star&label=❔&message=Ask%20Questions&color=9cf)](https://github.com/gihanpanapitiya/deep_learning/issues) If you have any questions, please add an issue on GitHub. \n", "\n", "---" ] }, { "cell_type": "code", "execution_count": null, "id": "57d0032c-92fe-44ad-bf90-fdc5dfd56a01", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "p11", "language": "python", "name": "p11" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }