System architecture and requirements / Scale the ELMA365 Enterprise cluster

Scale the ELMA365 Enterprise cluster

The diagram in this article displays the protocols and ports used for communication between the servers in ELMA365. The information provided on the diagram can also be used for deploying and scaling ELMA365 fault-tolerant clusters.


Data storage services (PostgreSQL, MongoDB, RabbitMQ, Redis, MinIO S3 Deployment) have specific rules of deployment and scaling. You can read more in the Fault-tolerant server architecture section of the System Requirements for ELMA365 Enterprise edition article.


ELMA365 consists of multiple microservices isolated in containers all managed by Kubernetes container orchestrator (the current ELMA 365 On-premises distribution configured to use MicroK8s package, by default).



ELMA365 cluster internal services



Authorization and groups, users, and org chart management


Multi-tenancy management


Calculating different values in-app items


Private and group messages


App items read and filter


Office formats to pdf converting


Migration management


Files and directories


Document workflow. Approval and review.


System events monitoring and processing via events bus


Activity stream / channels


ELMA365 Front end


External integrations


Email management


API gateway


Live chats and external messengers management


Notifications and web-sockets


Process management


EDS fingerprint management


Schedules, tasks delayed start, time reports


User profile and system profile management


Text and office documents templater


Users management in a multi-tenancy system


Web-forms for external systems management


Widget management (interface customization)


Validating, transpiling, and user scripting


* - the service cannot be executed in multiple instances


Fault tolerance and scaling of individual services


Distributed Services Architecture allows the solution to scale more flexibly depending on the load profile of the system. By default, all cluster services (except scheduler, notifier, messengers) in the ELMA365 Enterprise cluster are performed in two instances (for ELMA365 Standard edition, in a single instance).


If the cluster combines multiple servers, the microk8s orchestrator tries to evenly distribute the instances of the services among the servers. If either server in the cluster goes down, the orchestrator determines the services interrupted by the failure and replicates them on another server.


It is recommended to run the ELMA365 cluster on at least three nodes to ensure fault tolerance. 

Nodes work together, and the cluster permanently checks if all of them are up and running. The server is considered to be “down” if it is unreachable through a network for a specified time. The cluster is fully functional until at least two nodes are up and running. 


Adding servers without proper cluster reconfiguration will not automatically solve the problem of increased load. This approach only decreases the chance of service global failure. To leverage cluster throughput you can upgrade any distinct server hardware to a relatively great extent ( handling up to 10000 simultaneous transactions per server).


However, often we can identify a workload bottleneck and eliminate it by scaling the corresponding service. Let’s take a look at the worker script executing service. This service is resource-hungry but runs in parallel very well. If your server is heavily used for server-side script execution then it is reasonable to configure a high replication factor for this service (above 2). The orchestrator will then create additional instances of the service to perform the scripts faster in parallel.


In the same way, you can configure multi-instances parallel execution for any other service. However, to understand what services should be scaled, it is necessary to analyze the workload profile of a particular configuration at a specific time.

Found a typo? Highlight the text, press ctrl + enter and notify us