Tuesday, July 23, 2019

Today we describe authentication for the Keycloak admin REST API referenced earlier. The API is helpful for Kubernetes cluster security where users are identified with Keycloak. We assume a deployed instance is available at http://localhost:8080/auth.

The admin API takes a token from '/realms/master/protocol/openid-connect/token'. A token can be requested using the password grant with the default admin credentials from the open source Keycloak distribution.
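As a sketch, such a token request can be prepared with the Python standard library. The admin/admin credentials and the admin-cli client_id are assumptions typical of a development install; the request is only constructed here, not sent:

```python
import urllib.parse
import urllib.request

TOKEN_PATH = "/realms/master/protocol/openid-connect/token"

def token_request(base_url="http://localhost:8080/auth",
                  username="admin", password="admin"):
    # Build the password-grant form body for the admin-cli client.
    data = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": "admin-cli",
        "username": username,
        "password": password,
    }).encode()
    return urllib.request.Request(base_url + TOKEN_PATH, data=data, method="POST")

req = token_request()
print(req.full_url)
```

Sending the request with urllib.request.urlopen(req) against a live instance returns a JSON body whose access_token field authenticates the admin calls.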
To clear login failures for all users and release temporarily disabled users, we use:

DELETE /{realm}/attack-detection/brute-force/users

To get the status of a username in brute force detection, we have

GET /{realm}/attack-detection/brute-force/users/{userId}

To get all roles for the realm or client, we have

GET /{realm}/clients/{id}/roles

To get a role by name, we have

GET /{realm}/clients/{id}/roles/{role-name}

To update a role by role name, we have

PUT /{realm}/clients/{id}/roles/{role-name}

To add a composite to the role, we have

POST /{realm}/clients/{id}/roles/{role-name}/composites

To add a client-level role to the group role mappings, we have

POST /{realm}/groups/{id}/role-mappings/clients/{client}

To get a list of all users, we have

GET /{realm}/users

To get the representation of a user, we have

GET /{realm}/users/{id}

To revoke consents and offline tokens for a particular client from a user, we have

DELETE /{realm}/users/{id}/consents/{client}

To get all admin-events for a realm, we have

GET /{realm}/admin-events

To get the client registration policy providers with configProperties properly filled, we have

GET /{realm}/client-registration-policy/providers
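Each admin call then carries the access token as a bearer header. A minimal sketch that again only constructs the request; the realm name master and the token value are illustrative:

```python
import urllib.request

def admin_get(path, token, base_url="http://localhost:8080/auth/admin/realms"):
    # Attach the access token as a bearer credential on the admin request.
    req = urllib.request.Request(base_url + path)
    req.add_header("Authorization", "Bearer " + token)
    return req

req = admin_get("/master/users", "ACCESS_TOKEN")  # hypothetical token value
print(req.full_url)
```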

Note that Kubernetes namespaces are not part of the Keycloak role representation. Keycloak may or may not be hosted on Kubernetes. To use kubectl to enumerate ServiceInstance and ServiceBinding resources, we need to use the proper namespace.

#codingexercise

Count the number of nodes in a circular linked list
Integer count(Node start) {
    if (start == null) return 0;
    int count = 1;
    Node cur = start;
    // Walk the circle until we return to the starting node.
    while (cur.next != start) {
        count += 1;
        cur = cur.next;
    }
    return count;
}

Monday, July 22, 2019

Today we enumerate some of the APIs from the admin REST API reference for Keycloak. This is helpful for Kubernetes cluster security where users are identified with Keycloak. We assume a deployed instance is available at http://localhost:8080/auth.
To clear login failures for all users and release temporarily disabled users, we use:
DELETE /{realm}/attack-detection/brute-force/users
To get the status of a username in brute force detection, we have
GET /{realm}/attack-detection/brute-force/users/{userId}
To get all roles for the realm or client, we have
GET /{realm}/clients/{id}/roles
To get a role by name, we have
GET /{realm}/clients/{id}/roles/{role-name}
To update a role by role name, we have
PUT /{realm}/clients/{id}/roles/{role-name}
To add a composite to the role, we have
POST /{realm}/clients/{id}/roles/{role-name}/composites
To add a client-level role to the group role mappings, we have
POST /{realm}/groups/{id}/role-mappings/clients/{client}
To get a list of all users, we have
GET /{realm}/users
To get the representation of a user, we have
GET /{realm}/users/{id}
To revoke consents and offline tokens for a particular client from a user, we have
DELETE /{realm}/users/{id}/consents/{client}
To get all admin-events for a realm, we have
GET /{realm}/admin-events
To get the client registration policy providers with configProperties properly filled, we have
GET /{realm}/client-registration-policy/providers
Note that Kubernetes namespaces are not part of the Keycloak role representation. Keycloak may or may not be hosted on Kubernetes. To use kubectl to enumerate ServiceInstance and ServiceBinding resources, we need to use the proper namespace.

Sunday, July 21, 2019

We were enumerating pod security policies yesterday. These include the following:
1) privileged - determines whether any container in a pod can be allowed access to devices on the host. A privileged container has access to all devices on the host.
2) hostPID/hostIPC - controls whether the pod's containers can share the host process ID and IPC namespaces, which makes them accessible from outside the container.
3) hostNetwork/hostPort - controls whether the pod may share the node's network namespace and thereby gain access to the loopback device, listen on localhost, or snoop on network activity.
4) allowedHostPaths - specifies a whitelist of host paths, where pathPrefix allows only host paths beginning with that prefix and a readOnly field indicates no write access.
5) allowedFlexVolumes - specifies a whitelist of flexVolume drivers when the volume is a flexVolume type. A flexVolume allows vendor-specific operations for third-party storage backend providers.
6) fsGroup - controls the group ownership of volumes; a runAs rule can specify the fsGroup ID.
7) readOnlyRootFilesystem - requires containers to run with no writable root layer.
8) runAsUser and runAsGroup - specify which user ID or group the containers run as. Requiring a non-root user enforces a least-privilege policy.
9) Privilege escalation - these options control the allowPrivilegeEscalation container option.
10) Capabilities - specify Linux capabilities, which are per-thread attributes granting subsets of the permissions otherwise reserved for privileged accounts. A whitelist of capabilities is specified with this list.
11) SELinux - short for Security-Enhanced Linux, provides support for the enforcement of mandatory access control policies. Rules using RunAs specify different seLinuxOptions.
12) allowedProcMountTypes - specifies a whitelist of proc mount types. Most container runtimes mask certain paths in /proc to avoid divulging special devices or information.
13) AppArmor and seccomp - are annotations for profiles that the containers can run with.
14) forbiddenSysctls - excludes specific sysctls, which can be a combination of safe and unsafe sysctls.
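A minimal PodSecurityPolicy sketch combining several of the directives above; the policy name and the specific values are illustrative assumptions, not a recommended baseline:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example        # hypothetical policy name
spec:
  privileged: false               # 1) no privileged containers
  hostPID: false                  # 2) no sharing of host PID/IPC namespaces
  hostIPC: false
  hostNetwork: false              # 3) no host networking
  allowedHostPaths:               # 4) whitelist of host paths
  - pathPrefix: /var/log
    readOnly: true
  readOnlyRootFilesystem: true    # 7) no writable root layer
  runAsUser:                      # 8) enforce non-root users
    rule: MustRunAsNonRoot
  allowPrivilegeEscalation: false # 9) no privilege escalation
  seLinux:                        # 11) SELinux options
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:                        # 6) group owning the volumes
    rule: RunAsAny
  volumes:
  - configMap
  - secret
  - emptyDir
```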

Saturday, July 20, 2019

The security of a container depends on the following three items:
1) Role and usage of service accounts
2) Role based access controls
3) Defining the security context of a pod.
Kubectl service account creation is covered earlier. Role-based access control helps manage the proliferation of user accounts that access secrets. Securing pods helps with the enforcement of policies such as the least-privilege policy.
Let us look at some of the security context settings to apply to pods.
The policies are enumerated as:
1) privileged: this governs whether containers may run as privileged.
2) hostPID: usage of host namespaces
3) hostNetwork: usage of host networking
4) volumes: usage of volume types
5) allowedHostPaths: usage of host paths
6) allowedFlexVolumes: usage of flex volumes
7) fsGroup: allocating an FSGroup that owns the pod's volumes.
8) readOnlyRootFilesystem: requires the use of a read-only root filesystem.
9) runAsUser: the user ID of the container.
10) allowPrivilegeEscalation: restricting escalation to root privileges
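Several of these settings map directly onto the securityContext fields of a pod specification. A minimal sketch; the pod name, image, and numeric IDs are illustrative assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-example            # hypothetical pod name
spec:
  securityContext:
    runAsUser: 1000               # run as a non-root user ID
    fsGroup: 2000                 # group that owns the pod's volumes
  containers:
  - name: app
    image: nginx:1.17             # illustrative image
    securityContext:
      privileged: false
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
```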

Friday, July 19, 2019

When pods are in trouble, they can be recovered with a liveness probe. The logs for the pods indicate whether there were errors. However, the mitigation to restart the pods cannot always be based on the detection of errors from the logs. This is where a liveness probe helps, because it will restart the pod automatically when the probe fails. There are three different types of probes: 1) a probe can be a command, 2) it can be an HTTP request to a path served by a web server, or 3) it can be a generic TCP probe. All types of probes target what is running within the containers.
The traffic flow to a pod can be controlled using a readiness probe. Even if the pods are up and running, we only want to send traffic to them when they are ready to serve requests. The readiness probe has the same three types as the liveness probe, and the two can look identical. However, they serve different purposes and should be maintained separately.
The liveness and readiness probes are defined in the containers section of the pod specification. They are denoted by livenessProbe and readinessProbe in the pod deployment yaml specification.
The kubelet on each worker node uses a livenessProbe to overcome ramp-up issues or deadlocks. A service that load-balances a set of pods uses the readinessProbe to determine whether a pod is ready and hence should receive traffic. A failing livenessProbe triggers the container's restartPolicy, while a failing readinessProbe removes the pod from the service endpoints until the probe passes again.
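A minimal sketch of both probes in the containers section; the image, paths, ports, and timings are illustrative assumptions:

```yaml
containers:
- name: app
  image: nginx:1.17              # illustrative image
  livenessProbe:                 # restart the container when this fails
    httpGet:
      path: /healthz             # hypothetical health endpoint
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 20
  readinessProbe:                # gate traffic on this succeeding
    tcpSocket:
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
```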


Thursday, July 18, 2019

Today we continue with our discussion on Kubernetes user accounts and refresh tokens from our earlier post. The refresh token is retrieved from the identity provider's authorization URL. Kubectl refreshes the ID token with the help of a refresh token. Kubernetes never uses the refresh token; it is meant to be a secret for the user. A refresh token is generated only once, and it can only be passed between the user and the identity provider. This makes it more secure than long-lived bearer tokens. It is also opaque, so no personally identifiable information is divulged. The Kubernetes dashboard uses the ID token and refresh token. It does not have a login system, so it requires an existing token. The dashboard has therefore required the use of a reverse proxy which injects the id_token on each request. The same reverse proxy then refreshes the token as needed. This alleviates user authentication from the Kubernetes dashboard so much that it can now be directly included with the user interface of the applications hosted on the Kubernetes system. Most of the panels in the dashboard are read-only, so this is very helpful to all users.
The identity provider serves two purposes. First, it honors the OpenID Connect way of providing identities; as part of that, it needs to support discovery, which is required to make calls. Second, it is required to support the generation of tokens and to inject them into the kube configuration. A variety of identity providers can support both of these functionalities.

Tuesday, July 16, 2019

#codingexercise
Method to classify synonyms:
def classify_synonyms():
    # Each entry maps a word to its candidate synonyms/group labels.
    words = [{'cat': ['animal', 'feline']}, {'dog': ['animal', 'lupus']},
             {'dolphin': ['fish', 'pisces']}, {'spider': ['insect', 'arachnid']}]
    groups = []
    for item in words:
        word = next(iter(item.keys()))
        synonyms = next(iter(item.values()))
        merged = False
        for group in groups:
            label = next(iter(group.keys()))
            # Merge the word into an existing group when its label matches a synonym.
            if label in synonyms:
                group[label].append(word)
                merged = True
                break
        if not merged:
            # Start a new group keyed by the first synonym.
            groups.append({synonyms[0]: [word]})
    print(groups)

classify_synonyms()
# [{'animal': ['cat', 'dog']}, {'fish': ['dolphin']}, {'insect': ['spider']}]
The above method merely classifies the input to the first level of grouping. It does not factor in multiple matches between synonyms, selection of the best match in the synonym, unavailability of synonyms, unrelated words, and unrelated synonyms. The purpose is just to show that given a criterion for the selection of a group, the words can be merged. The output of the first level could then be taken as the input of the second level. The second level can then be merged and so on until a dendrogram appears.  
Given this dendrogram, it is possible to take edge distance as the distance metric for semantic similarity.  
Since we do this hierarchical classification only for the finite number of input words in a text, we can take it to be a bounded cost of O(n log n), assuming a fixed upper cost for each merge.

The level of each node in such a hierarchy can be computed with memoized recursion. This assumes a pandas DataFrame df with columns ID and Group and a parent pointer GroupID, where the root points to 0:

def nlevel(id, group_dict=df.GroupID, _cache={0: 0}):
    if id in _cache:
        return _cache[id]
    # Memoize the level as one more than the parent's level.
    _cache[id] = 1 + nlevel(group_dict[id], group_dict)
    return _cache[id]

df['nLevel'] = df.ID.map(nlevel)

print(df[['nLevel', 'ID', 'Group']])

It is also possible to apply page rank given that synonyms are connected by edges. 
def pagerank(u, constant):
    # The rank of u is a constant term plus contributions from its neighbors.
    total = constant
    for v in adjacencies(u):
        total += pagerank_for_node(v) / number_of_links(v)
    return total
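The recursive form above assumes helper functions. A self-contained sketch of the standard iterative PageRank over a tiny synonym graph follows; the graph edges, damping factor, and iteration count are illustrative assumptions:

```python
def pagerank(graph, damping=0.85, iterations=50):
    # graph maps each node to its list of outgoing links.
    n = len(graph)
    ranks = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        # Each node receives a share of rank from every node linking to it.
        ranks = {
            node: (1 - damping) / n + damping * sum(
                ranks[v] / len(graph[v]) for v in graph if node in graph[v])
            for node in graph
        }
    return ranks

# A tiny synonym graph; the edges are illustrative.
graph = {'cat': ['animal'], 'dog': ['animal'], 'animal': ['cat', 'dog']}
ranks = pagerank(graph)
print(ranks)
```

Because the hub word 'animal' receives contributions from both 'cat' and 'dog', it ends up with the highest rank.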