
GitLab provides several access control measures, one of which is the level of access a user can have for a particular group or subgroup. These levels can be configured manually for each user in the UI, but the process can also be automated, with GitLab group/subgroup membership linked to pre-existing Identity Management (IDM) group membership. This article discusses how we at cloudWerkstatt have implemented this automated solution using GitLab SAML group links, and how you can do the same.
What is SAML?
Security Assertion Markup Language (SAML) is an open standard for authentication based on Extensible Markup Language (XML). The purpose of SAML is to transfer authentication data between an identity provider (IdP) and a service provider (SP), so that users only need one set of credentials to log in to multiple web applications. (In theory, SAML can also be used for native applications, but this is not its intended use and is therefore quite cumbersome to implement.) This authentication data is known as a SAML assertion.
What is IDM?
Identity Management (IDM) is the means of managing digital identities and falls under the umbrella of Identity and Access Management (IAM). More on identities is covered in the section “What is RBAC?”. These identities need to be centralised for maintenance and monitoring, which is the purpose of an IDM system. At cloudWerkstatt, we use Red Hat Identity Management as our IDM system, and this application is referred to as IDM in the remainder of this article.
What is Keycloak?
Keycloak is an open-source IAM tool developed by Red Hat that supports several different protocols, including SAML. Keycloak’s features include Single Sign-On (SSO) to SPs as well as the ability to authenticate and authorise users. User information can be pulled from multiple IdPs. Keycloak also supports login to SPs via social networks such as Facebook, and can federate users by connecting to existing LDAP (Lightweight Directory Access Protocol) or AD (Active Directory) servers.
What is Ansible?
Ansible is an open-source tool for IT automation. It abstracts Python code into YAML files for easier readability and accessibility as users do not need to write Python code directly. These files can then be run on an ad-hoc basis or on a schedule, improving automation processes. For more information on Ansible, you can view our article Introduction to Ansible.
What is RBAC?
Role-Based Access Control (RBAC) is a means of dictating which application resources a user can access based on their associated role. Within an organisation, a user is referenced via an identity. This identity can have permissions assigned to it directly, determining what the user can and cannot access, but it is better practice to assign these permissions to a role. Individual users are then assigned to the role rather than to the permissions themselves, allowing multiple users to receive the same permissions in a centralised fashion. Users can be assigned to one or multiple roles, yet it is important to follow the principle of least privilege, ensuring that users only have the minimum permissions needed to fulfil their duties. RBAC and least privilege are paramount for any organisation as they enforce security and privacy.
What are Group Links?
Group Links (as part of SAML) are a means of mapping user groups from IdPs to SPs. This is an implementation of RBAC and allows security policies to be defined in one IdP but be enforced across multiple web applications.
How It All Fits Together
When all of these are combined, a system is created that allows federated users to access specific resources on desired web applications. Since this article relates to how we have applied this to GitLab, consider the following process:
- A user attempts to log in to GitLab.
- The user is directed to Keycloak’s authentication page and provides their credentials.
- If authentication is successful, Keycloak retrieves the user information it has synchronised from IDM, including IDM group membership. This information forms part of the SAML assertion.
- GitLab receives the SAML assertion, extracts the user information, and applies access controls to the user by way of corresponding GitLab SAML group links.
- Thus the user gains access to GitLab but only to the groups and subgroups that correspond to the groups in IDM.
In our implementation, IDM is the IdP and GitLab is the SP, meaning SAML assertions connect IDM with GitLab, while Keycloak provides the SSO page for users to log in to GitLab. Groups are created in IDM and users are added to those groups. This group membership is passed as part of the SAML assertion to GitLab upon login via Keycloak, and it is SAML group links within GitLab that then determine which groups/subgroups the user can access based on the received group membership. These SAML group links are initially configured via Ansible, where IDM group data is passed to the GitLab API. However, the Ansible playbook only needs to be run again if new IDM groups or SAML group links are needed. If users are added/removed from groups in IDM, only a Keycloak sync is required to update the content of the SAML assertions at login.
Example Configuration
The role to configure SAML group links in GitLab is referenced by a playbook:
```
- name: Sync SAML groups to GitLab group with access levels
  hosts: localhost  # assumption: not shown in the original; every task in the role delegates to localhost
  gather_facts: false
  roles:
    - gitlab_saml_group_link
  vars:
    gitlab_saml_group_link_api_version: "v4"
    gitlab_saml_group_link_api_validate_certs: true
    gitlab_saml_group_link_gitlab_host: "git.example.com"
    gitlab_saml_group_link_suffix_access_mapping:
      guest: 10
      reporter: 20
      developer: 30
      maintainer: 40
      owner: 50
    gitlab_saml_group_link_keycloak_host: "example.keycloak.com"
    gitlab_saml_group_link_keycloak_realm: "example_realm"
    gitlab_saml_group_link_gitlab_access_token: ""
    gitlab_saml_group_link_keycloak_username: ""
    gitlab_saml_group_link_keycloak_password: ""
    gitlab_saml_group_link_gitlab_groups:
      - saml_group_name: cw_gitlab_saml_archived-developer
        gitlab_group_name: Archived
        access_level: developer
      - saml_group_name: cw_gitlab_saml_archived-maintainer
        gitlab_group_name: Archived
        access_level: maintainer
      - saml_group_name: cw_gitlab_saml_archived-owner
        gitlab_group_name: Archived
        access_level: owner
      - saml_group_name: cw_gitlab_saml_cloudwerkstatt-developer
        gitlab_group_name: cloudWerkstatt
        access_level: developer
      - saml_group_name: cw_gitlab_saml_cloudwerkstatt-maintainer
        gitlab_group_name: cloudWerkstatt
        access_level: maintainer
      - saml_group_name: cw_gitlab_saml_cloudwerkstatt-owner
        gitlab_group_name: cloudWerkstatt
        access_level: owner
```
All our variables have been placed in the playbook definition for easier readability. In practice, however, most of these are stored in our corresponding Ansible inventory.
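As a rough illustration of that split, the non-secret settings could live in a group variables file while the credentials are referenced from a vault-encrypted file. This is only a sketch: the file paths and the `vault_`-prefixed variable names are hypothetical and chosen for the example.

```yaml
# inventory/group_vars/all/gitlab_saml_group_link.yml (hypothetical path)
gitlab_saml_group_link_api_version: "v4"
gitlab_saml_group_link_api_validate_certs: true
gitlab_saml_group_link_gitlab_host: "git.example.com"
gitlab_saml_group_link_keycloak_host: "example.keycloak.com"
gitlab_saml_group_link_keycloak_realm: "example_realm"

# Secrets point at values kept in a vault-encrypted file next to this one,
# e.g. inventory/group_vars/all/vault.yml (hypothetical), so that no
# credentials are stored in plain text.
gitlab_saml_group_link_gitlab_access_token: "{{ vault_gitlab_saml_group_link_gitlab_access_token }}"
gitlab_saml_group_link_keycloak_username: "{{ vault_gitlab_saml_group_link_keycloak_username }}"
gitlab_saml_group_link_keycloak_password: "{{ vault_gitlab_saml_group_link_keycloak_password }}"
```

A vault file of this kind can be created with `ansible-vault create` and unlocked at run time with `--ask-vault-pass` or a vault password file.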
The role itself is divided into two parts: first it extracts the desired groups from Keycloak, then it applies SAML group links using these group names. Let’s break down the role into individual tasks.
```
# tasks file for gitlab_saml_group_link
- name: Get access token from Keycloak
  ansible.builtin.uri:
    url: "https://{{ gitlab_saml_group_link_keycloak_host }}/realms/master/protocol/openid-connect/token"
    method: POST
    headers:
      Content-Type: "application/x-www-form-urlencoded"
    body:
      client_id: "admin-cli"
      username: "{{ gitlab_saml_group_link_keycloak_username }}"
      password: "{{ gitlab_saml_group_link_keycloak_password }}"
      grant_type: "password"
    body_format: form-urlencoded
    return_content: true
  register: __gitlab_saml_group_link_keycloak_token_response
  no_log: true
  delegate_to: localhost

- name: Extract access token
  ansible.builtin.set_fact:
    gitlab_saml_group_link_keycloak_access_token: "{{ __gitlab_saml_group_link_keycloak_token_response.json.access_token }}"
  no_log: true
  delegate_to: localhost
```
Before we can retrieve the groups from Keycloak, we need to be able to reach Keycloak itself. The above two tasks gather an OpenID token and then extract the specific access token to be used in later API calls (OpenID Connect, shortened to OIDC, is another authentication protocol similar to SAML, but is built upon the Open Authorization, OAuth 2.0, framework).
```
- name: Get list of groups from Keycloak
  ansible.builtin.uri:
    url: "https://{{ gitlab_saml_group_link_keycloak_host }}/admin/realms/{{ gitlab_saml_group_link_keycloak_realm }}/groups"
    method: GET
    headers:
      Authorization: "Bearer {{ gitlab_saml_group_link_keycloak_access_token }}"
      Content-Type: "application/json"
    return_content: true
  register: __gitlab_saml_group_link_keycloak_groups_response
  delegate_to: localhost
```
This task is to obtain all the groups stored in Keycloak, no matter where they may originate. For us, these groups are defined in IDM, but you may have a different or potentially multiple sources where identity groups are defined.
```
- name: Filter Keycloak groups with the cw_gitlab_saml_* prefix
  ansible.builtin.set_fact:
    gitlab_saml_group_link_filtered_keycloak_groups: "{{ __gitlab_saml_group_link_keycloak_groups_response.json | selectattr('name', 'search', '^cw_gitlab_saml') | map(attribute='name') | list }}"
  when: __gitlab_saml_group_link_keycloak_groups_response | length > 0
  delegate_to: localhost
```
Since we have many groups in Keycloak that are unrelated to GitLab, we decided to add a prefix to all GitLab groups. This particular task filters groups with our GitLab prefix so that only GitLab groups are considered for the following tasks. You may have a more appropriate prefix you can use for your implementation, or you may not need this filtering task at all.
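The effect of the filter can be checked in isolation with a small throwaway playbook. The sketch below assumes a hypothetical, trimmed-down list of Keycloak groups (`sample_keycloak_groups`, including the unrelated `cw_jira_users`) and asserts that only the prefixed names remain.

```yaml
- name: Demonstrate the prefix filter on sample data (illustration only)
  hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical stand-in for the JSON list returned by Keycloak.
    sample_keycloak_groups:
      - name: cw_gitlab_saml_archived-developer
      - name: cw_jira_users
      - name: cw_gitlab_saml_cloudwerkstatt-owner
  tasks:
    - name: Check that only groups with the GitLab prefix survive the filter
      ansible.builtin.assert:
        that:
          - >-
            sample_keycloak_groups
            | selectattr('name', 'search', '^cw_gitlab_saml')
            | map(attribute='name') | list
            == ['cw_gitlab_saml_archived-developer', 'cw_gitlab_saml_cloudwerkstatt-owner']
```

Swapping the pattern for your own prefix is enough to try the same check against your naming scheme.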
```
- name: Ensure groups exists for Keycloak group
  ansible.builtin.fail:
    msg: "No groups found in Keycloak: {{ gitlab_saml_group_link_filtered_keycloak_groups }}"
  when: gitlab_saml_group_link_filtered_keycloak_groups | length < 1
```
The variable `gitlab_saml_group_link_filtered_keycloak_groups` holds the group names produced by the filter task above. This task stops the play with a clear message if no Keycloak group matched the prefix. The comparison between the groups obtained from Keycloak and those defined in our inventory (more on this later) happens in the included tasks below, which notify the user of any discrepancies.
```
- name: Include SAML group mapping tasks
  ansible.builtin.include_tasks: map_saml_group.yml
  loop: "{{ gitlab_saml_group_link_filtered_keycloak_groups }}"
  loop_control:
    loop_var: gitlab_saml_group_link_one_filtered_keycloak_group
  when:
    - gitlab_saml_group_link_filtered_keycloak_groups | length > 0
```
This task leads on to the second part of the role: the implementation of SAML group links in GitLab.
```
- name: Loop through Keycloak results and create SAML group links
  block:
    - name: Ensure inventory variable gitlab_saml_group_link_gitlab_groups is populated
      ansible.builtin.fail:
        msg: "No content found in inventory: {{ gitlab_saml_group_link_gitlab_groups }}"
      when: gitlab_saml_group_link_gitlab_groups | length < 1
```
This task relates to our inventory content. Let's sidestep briefly into the inventory and take a look at `gitlab_saml_group_link_gitlab_groups`:
```
gitlab_saml_group_link_gitlab_groups:
  - saml_group_name: cw_gitlab_saml_archived-developer
    gitlab_group_name: Archived
    access_level: developer
  - saml_group_name: cw_gitlab_saml_archived-maintainer
    gitlab_group_name: Archived
    access_level: maintainer
  - saml_group_name: cw_gitlab_saml_archived-owner
    gitlab_group_name: Archived
    access_level: owner
  - saml_group_name: cw_gitlab_saml_cloudwerkstatt-developer
    gitlab_group_name: cloudWerkstatt
    access_level: developer
  - saml_group_name: cw_gitlab_saml_cloudwerkstatt-maintainer
    gitlab_group_name: cloudWerkstatt
    access_level: maintainer
  - saml_group_name: cw_gitlab_saml_cloudwerkstatt-owner
    gitlab_group_name: cloudWerkstatt
    access_level: owner
```
`gitlab_saml_group_link_gitlab_groups` is a list of dictionaries. Each dictionary contains the keys `saml_group_name`, `gitlab_group_name`, and `access_level`. The `saml_group_name` is the name of the group in IDM (and thus also in Keycloak). The `gitlab_group_name` is how the GitLab group is referenced via URL, i.e. `git.example.com/Archived`. Finally, the `access_level` refers to the level of privileges a user should have when part of this GitLab group. These string values (e.g. developer) correspond to integer values used by GitLab to determine access levels. These are defined in the variable below:
```
gitlab_saml_group_link_suffix_access_mapping:
  guest: 10
  reporter: 20
  developer: 30
  maintainer: 40
  owner: 50
```
To aid readability, our IDM groups each end with a suffix to determine the access level, thus the suffix can be extracted, compared with that in `gitlab_saml_group_link_suffix_access_mapping`, and the integer value returned for GitLab API calls. Depending on your circumstances, you may wish to use all of these access levels or just those relevant to you.
See the official GitLab documentation on [permissions](https://docs.gitlab.com/user/permissions/) and [setting access with the API](https://docs.gitlab.com/api/access_requests/) for more information on these topics.
Now going back to the role, the first task of the block ensures that the `gitlab_saml_group_link_gitlab_groups` variable has been defined and has at least one element, ensuring that we have determined which IDM group corresponds with which GitLab group, as well as which access level should be granted to users in this group.
```
    - name: Filter GitLab group entry for the current Keycloak group
      ansible.builtin.set_fact:
        gitlab_saml_group_link_current_mapping: >-
          {{
            gitlab_saml_group_link_gitlab_groups | selectattr('saml_group_name', 'equalto', gitlab_saml_group_link_one_filtered_keycloak_group) | list | first | default({})
          }}
      when: gitlab_saml_group_link_one_filtered_keycloak_group is defined
      delegate_to: localhost

    - name: Ensure mapping exists for Keycloak group in inventory
      ansible.builtin.fail:
        msg: "Keycloak SAML group: {{ gitlab_saml_group_link_one_filtered_keycloak_group }} not found in inventory"
      when: gitlab_saml_group_link_current_mapping == {}
```
These tasks identify which element in our inventory list matches the value obtained from Keycloak by comparing the names of the groups. If a match is identified, the matching entry is stored in the `gitlab_saml_group_link_current_mapping` dictionary to be used later in the role. If no match is identified, the role fails with an explicit error message.
```
    - name: Map groups to access levels
      ansible.builtin.set_fact:
        gitlab_saml_group_link_groups_with_access_levels: >-
          {{
            gitlab_saml_group_link_groups_with_access_levels | default([]) + [{
              "group_name": gitlab_saml_group_link_current_mapping.saml_group_name,
              "access_level": gitlab_saml_group_link_suffix_access_mapping.get(gitlab_saml_group_link_current_mapping.access_level)
            }]
          }}
      when: gitlab_saml_group_link_one_filtered_keycloak_group is defined
      delegate_to: localhost
```
Here we prepare some of the information to send to the GitLab API, namely a `group_name` (sent as `saml_group_name` in the request body) and an `access_level`. This task maps the name of the group to `group_name` and the integer value corresponding to the group's access level (the string suffix of the group name) to `access_level`. This mapping is then appended to the list `gitlab_saml_group_link_groups_with_access_levels`.
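Because each IDM group name ends with its access level, the level could in principle be derived from the name itself instead of being read from the `access_level` key. The playbook below is a minimal sketch of that idea, not part of the role: `example_group_name` and `example_suffix` are hypothetical variables, and it assumes the suffix always follows the last hyphen.

```yaml
- name: Derive a GitLab access level from a group name suffix (illustration only)
  hosts: localhost
  gather_facts: false
  vars:
    gitlab_saml_group_link_suffix_access_mapping:
      guest: 10
      reporter: 20
      developer: 30
      maintainer: 40
      owner: 50
    example_group_name: cw_gitlab_saml_archived-developer  # hypothetical input
  tasks:
    - name: Extract the text after the last hyphen
      ansible.builtin.set_fact:
        example_suffix: "{{ example_group_name.split('-') | last }}"

    - name: Confirm the suffix resolves to the expected integer
      ansible.builtin.assert:
        that:
          - example_suffix == 'developer'
          - gitlab_saml_group_link_suffix_access_mapping[example_suffix] == 30
```

Keeping an explicit `access_level` key in the inventory, as the role does, avoids relying on the naming convention being followed everywhere.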
```
    - name: Retrieve GitLab group ID for the mapped GitLab group
      ansible.builtin.uri:
        url: "https://{{ gitlab_saml_group_link_gitlab_host }}/api/{{ gitlab_saml_group_link_api_version }}/groups?search={{ gitlab_saml_group_link_current_mapping.gitlab_group_name }}"
        method: GET
        headers:
          PRIVATE-TOKEN: "{{ gitlab_saml_group_link_gitlab_access_token }}"
      register: __gitlab_saml_group_link_gitlab_group_search_response
      when: gitlab_saml_group_link_current_mapping.gitlab_group_name is defined
      delegate_to: localhost
```
More information is prepared for the API call. The GitLab API determines groups by their ID, thus this task obtains an API response which contains group IDs.
```
    - name: Filter top-level GitLab group ID from search results
      ansible.builtin.set_fact:
        gitlab_saml_group_link_gitlab_group_id_list: >-
          {{
            __gitlab_saml_group_link_gitlab_group_search_response.json |
            selectattr('full_path', 'equalto', gitlab_saml_group_link_current_mapping.gitlab_group_name) |
            map(attribute='id') |
            list
          }}
      when: __gitlab_saml_group_link_gitlab_group_search_response.json is defined
      delegate_to: localhost
```
This API response is then filtered to extract the ID of our specific group. We found this was needed because our GitLab instance has many nested subgroups: a subgroup belonging to the group in question could be returned instead, and the SAML group link would then be applied to the incorrect group. By using the `full_path` value in the API response, we found we can clearly specify the group and so obtain the correct group ID. More on the use of `full_path` will be covered in Considerations. You will notice that the ID is added to a list variable rather than an integer variable. This choice is made clear in the tasks below.
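To see why matching on `full_path` matters, the filter can be run against a made-up search response in which two groups share a name. The sample data and the `wanted_full_path` variable below are hypothetical and only mimic the fields the role relies on.

```yaml
- name: Show why full_path is used to pick the group ID (illustration only)
  hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical, trimmed-down search response: two groups share a name.
    sample_search_response:
      - id: 11
        name: Documentation
        full_path: Documentation
      - id: 42
        name: Documentation
        full_path: Ansible/Test/Documentation
    wanted_full_path: Ansible/Test/Documentation
  tasks:
    - name: Only the entry whose full_path matches exactly is kept
      ansible.builtin.assert:
        that:
          - >-
            sample_search_response
            | selectattr('full_path', 'equalto', wanted_full_path)
            | map(attribute='id') | list == [42]
```

Filtering on `name` instead would return both IDs, which is exactly the ambiguity the list-based validation in the role is designed to catch.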
```
    - name: Validate GitLab group ID is not empty
      ansible.builtin.fail:
        msg: "gitlab_saml_group_link_gitlab_group_id_list is empty. No value was found in the inventory that matches {{ __gitlab_saml_group_link_gitlab_group_search_response.json | map(attribute='full_path') | list }}"
      when: gitlab_saml_group_link_gitlab_group_id_list | length < 1

    - name: Validate GitLab group ID result
      ansible.builtin.fail:
        msg: "Found two or more GitLab group IDs for '{{ gitlab_saml_group_link_current_mapping.gitlab_group_name }}'. Results: {{ __gitlab_saml_group_link_gitlab_group_search_response.json }}. IDs: {{ gitlab_saml_group_link_gitlab_group_id_list }}"
      when: gitlab_saml_group_link_gitlab_group_id_list | length > 1
```
These two tasks provide error handling in the case that 1) no ID can be extracted from the API response, which would break the role, or 2) multiple IDs are extracted, which could result in the SAML group link being applied to the incorrect GitLab group. Using a list to hold the GitLab group ID makes it easy to count how many IDs have been returned, and the role only continues when there is exactly one ID in the list.
```
    - name: Extract and convert GitLab group ID to integer
      ansible.builtin.set_fact:
        gitlab_saml_group_link_gitlab_group_id: >-
          {{
            (gitlab_saml_group_link_gitlab_group_id_list | first) | int
          }}
      when: gitlab_saml_group_link_gitlab_group_id_list | length > 0
      delegate_to: localhost
```
Since the group ID from the API response is a string, it needs to be converted into an integer for easier processing.
```
    - name: Assign GitLab SAML group link based on IDM groups
      ansible.builtin.uri:
        url: "https://{{ gitlab_saml_group_link_gitlab_host }}/api/{{ gitlab_saml_group_link_api_version }}/groups/{{ gitlab_saml_group_link_gitlab_group_id }}/saml_group_links"
        method: POST
        headers:
          PRIVATE-TOKEN: "{{ gitlab_saml_group_link_gitlab_access_token }}"
          Content-Type: "application/json"
        body: "{{ {'saml_group_name': item.group_name, 'access_level': item.access_level} | to_json }}"
        status_code: 201
      loop: "{{ gitlab_saml_group_link_groups_with_access_levels | selectattr('group_name', 'equalto', gitlab_saml_group_link_current_mapping.saml_group_name) | list }}"
      when:
        - gitlab_saml_group_link_gitlab_group_id | length > 0
        - gitlab_saml_group_link_current_mapping.saml_group_name is defined
      register: __gitlab_saml_group_link_gitlab_saml_assignments
      delegate_to: localhost
      failed_when: false
      ignore_errors: true
```
This is the primary task of the whole role: assigning the SAML group links. In the previous tasks, IDM groups have been mapped to GitLab groups, desired access levels have been set, and now this task brings everything together to call the GitLab API and apply the SAML group links. Notice that in the `url` parameter, the specific ID is used to reference the desired GitLab group, and in the `body` parameter, our prepared information of IDM group name and access level integer is applied. In the final lines of this task, errors are deliberately ignored. This is justified in the task below:
```
    - name: Catch all errors and inform user if SAML group link already exists
      ansible.builtin.fail:
        msg: >-
          {% if item.status == 400 and item.json.message == "Saml group name has already been taken" %}
          SAML group link already exists for {{ item.item.group_name }} and was skipped.
          {% else %}
          {{ item.json.message }}
          {% endif %}
      loop: "{{ __gitlab_saml_group_link_gitlab_saml_assignments.results }}"
      when: item.status != 201
```
It is possible that when running this role, SAML group links with the same name and access level may already be configured for the GitLab group. This would return a `400` response code from the API, hence the above task provides the user with the name of the duplicate SAML group link if indeed a `400` response code is received. However, the role might fail in other ways, thus the error JSON message is provided for debugging purposes. Since this task should catch all error messages, the role will always fail gracefully if there is an error.
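An alternative to catching the `400` afterwards is to ask GitLab which SAML group links already exist before posting a new one. The sketch below uses the GET counterpart of the same `saml_group_links` endpoint. It is not part of our role, and the registered variable name `__existing_saml_group_links` is hypothetical.

```yaml
- name: List existing SAML group links for the GitLab group
  ansible.builtin.uri:
    url: "https://{{ gitlab_saml_group_link_gitlab_host }}/api/{{ gitlab_saml_group_link_api_version }}/groups/{{ gitlab_saml_group_link_gitlab_group_id }}/saml_group_links"
    method: GET
    headers:
      PRIVATE-TOKEN: "{{ gitlab_saml_group_link_gitlab_access_token }}"
    return_content: true
  register: __existing_saml_group_links  # hypothetical variable name
  delegate_to: localhost

- name: Show which SAML group names are already linked
  ansible.builtin.debug:
    msg: "{{ __existing_saml_group_links.json | map(attribute='name') | list }}"
```

The resulting list of names could then be used in a `when` condition on the POST task so that existing links are skipped rather than reported as errors.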
Why Use SAML Group Links?
With the increasing number of specific applications for specific purposes, it can be difficult to keep track of all this disparate information. Centralising this information allows for consistency across multiple applications, as well as improved management due to everything residing under a single domain. GitLab SAML group links enforce this consistency by having GitLab user group/subgroup membership defined in a centralised application, the IdP. It also means that GitLab user access can potentially be reflected in other applications via the IdP. For example, rather than assigning auditor privileges to users in many separate applications (including GitLab), users can be added to a single IdP auditor group and have these permissions mirrored across multiple applications via SAML group links.
Considerations
In relation to the implementation described in this article, there may be some things you need to consider for your own implementation:
IDM Group Name Filter
As stated earlier, our Keycloak instance contains groups that are unrelated to GitLab. By adding a prefix to GitLab-related groups in IDM, only relevant groups are considered by our Ansible role. This may be beneficial to you if your Keycloak instance contains a wide variety of groups, or you may find this irrelevant if you only use Keycloak for GitLab groups.
Access Level Group Name Suffixes
Similar to above, we added suffixes to IDM group names so it is clear which IDM group corresponds to which access level in GitLab. We decided to use suffixes that match the official GitLab documentation, yet for your particular implementation, you may choose more appropriate suffixes for your needs. Just ensure that you are consistent for both the `gitlab_saml_group_link_gitlab_groups` and the `gitlab_saml_group_link_suffix_access_mapping` variables.
`full_path` Attribute for GitLab Group Filtering
Our GitLab instance contains many nested subgroups, often with the same names but in different locations. In early versions of this role, SAML group links were therefore being applied to the most deeply nested subgroup. This did not meet our needs as we wanted to be able to apply SAML group links to any group and at any level in the group tree. We found the best way to identify a specific group was with the `full_path` attribute returned by the GitLab API when requesting GitLab group IDs. Using the previous example of `git.example.com/Archived`, `Archived` is the value of `full_path`. If our GitLab structure looks like the following:
```
git.example.com
├── Archived
├── Ansible
│   ├── Test
│   │   └── Documentation
│   │       └── SAML
│   └── Production
└── Documentation
```
Then the `Documentation` subgroup within `Test` will have the URL `git.example.com/Ansible/Test/Documentation`. If we want this particular group to have a SAML group link, then in the `gitlab_saml_group_link_gitlab_groups` list, the `gitlab_group_name` value must be `Ansible/Test/Documentation` rather than just `Documentation` to match the group’s URL. If the `gitlab_group_name` was only `Documentation`, then the SAML group link would be applied to the group with the URL `git.example.com/Documentation`. It is important to keep this in mind when adding new dictionaries to the `gitlab_saml_group_link_gitlab_groups` list to prevent incorrectly applied SAML group links.
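For the nested `Documentation` subgroup in the tree above, an inventory entry might look like the following sketch. The `saml_group_name` shown is a hypothetical IDM group that simply follows our prefix and suffix conventions.

```yaml
gitlab_saml_group_link_gitlab_groups:
  # Hypothetical IDM group name following the prefix and suffix conventions.
  - saml_group_name: cw_gitlab_saml_ansible_test_documentation-developer
    # Full path of the subgroup, not just "Documentation".
    gitlab_group_name: Ansible/Test/Documentation
    access_level: developer
```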
Conclusion
GitLab SAML group links are a great way to federate user access to GitLab groups and subgroups without having multiple sources of truth. Groups defined in an Identity Management system can be passed to Keycloak to create SAML assertions, and these can be used to determine which groups and at which access level a particular user can interact with. Assigning SAML group links to GitLab groups can also be done via automation, reducing manual workload and ensuring consistency.
Sources
- www.onelogin.com/learn/saml
- www.saviynt.com/glossary-listing/identity-management-idm
- www.bayoglubatuhan.medium.com/what-is-keycloak-what-is-behind-of-the-keycloak-how-does-keycloak-work-d1be308d3227
- www.keycloak.org/
- about.gitlab.com/blog/2023/09/14/the-ultimate-guide-to-enabling-saml/#configuring-saml-group-sync-with-saml-group-links
- docs.gitlab.com/user/group/saml_sso/group_sync/
- www.microsoft.com/en-us/security/business/security-101/what-is-openid-connect-oidc
- www.identityserver.com/articles/why-you-wouldn-t-use-saml-in-a-spa-and-mobile-app
- kvonkonigslow.medium.com/integrating-a-saml-authenticated-service-with-a-net-desktop-application-using-webview2-b4b3a6c263da
- www.ibm.com/think/topics/rbac