Migrating to OpenSearch Manually

Migrating with Data Loss

  1. Install OpenSearch according to the installation instructions.

  2. Transfer settings from elasticsearch.yml file to opensearch.yml.

  3. If necessary, update the MDM settings: org.unidata.mdm.search.cluster.nodes, org.unidata.mdm.search.cluster.address.

  4. Since the data will not be migrated, perform the reindex data operation; if data quality rules are configured, also perform the reapply data operation.

Important:

  • If the search engine is selected as the audit storage, the event log will be lost: it cannot be recovered from the database, even if the database is also specified as a storage.

Migrating with Saving Data

Before updating the system, take a snapshot of the Elasticsearch indices. It is recommended that you perform this step so that you have a backup of the search node's state. See the official Elasticsearch documentation for details.

Index Snapshot

  1. Make a snapshot of Elasticsearch according to the "Search Index Backup" instructions.

  2. At this point, perform the update according to the "Migrating with Data Loss" section (see above).

  3. Next, restore the data from the snapshot in OpenSearch. Specify the repository path in opensearch.yml:

    path.repo: /mount/backups
    
  4. Add the repository to the cluster configuration. Execution example from the console using the curl utility:

    $ curl -XPUT 'localhost:9200/_snapshot/universe_indices_backup' -H 'Content-Type: application/json' -d '{ "type" : "fs", "settings" : { "compress" : true, "location" : "<path to repository directory from settings>/universe_indices_backup" } }'
    
  • If the path to the repository directory was specified in path.repo, it does not need to be included in the above command. In that case, the command would look like this:

    $ curl -XPUT 'localhost:9200/_snapshot/universe_indices_backup' -H 'Content-Type: application/json' -d '{ "type" : "fs", "settings" : { "compress" : true, "location" : "universe_indices_backup" } }'
    
  5. Check whether the previous step was successful (the command should show the current repository settings):

    $ curl -XGET 'localhost:9200/_snapshot/universe_indices_backup'
    
  6. Close the current indices:

    $ curl -XPOST 'localhost:9200/_all/_close'
    

The number of index shards in the snapshot and in the cluster must match; otherwise the restore will fail. This requirement does not apply to empty clusters.
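To compare the shard counts before restoring, you can inspect the snapshot metadata and the current shard layout. A minimal sketch, assuming the default host and port and the repository name used above:

```shell
# Shard count recorded in the snapshot (see the "indices" section of the output).
curl -XGET 'localhost:9200/_snapshot/universe_indices_backup/snapshot_1'

# Current shard layout in the cluster, per index.
curl -XGET 'localhost:9200/_cat/shards?v'
```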

  7. If the names of the existing snapshots are unknown, list them with:

    $ curl -XGET 'localhost:9200/_snapshot/_all'
    
  8. Restore the snapshot. Example command for a snapshot named snapshot_1:

    $ curl -XPOST 'localhost:9200/_snapshot/universe_indices_backup/snapshot_1/_restore'
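The restore runs asynchronously; its progress can be checked with the snapshot status and recovery APIs. A sketch, assuming the default host and port:

```shell
# Overall restore status for the snapshot.
curl -XGET 'localhost:9200/_snapshot/universe_indices_backup/snapshot_1/_status'

# Per-shard recovery progress; restored indices reopen automatically.
curl -XGET 'localhost:9200/_cat/recovery?v'
```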
    

Notes:

This variant is also applicable if Elasticsearch is deployed in Docker: it is enough to mount the local storage and the configuration file through Docker volumes.

A docker-compose volumes snippet for OpenSearch:

volumes:
- mdm-opensearch-data:/usr/share/opensearch/data
- ./hunspell:/usr/share/opensearch/config/hunspell/
- ./opensearch.yml:/usr/share/opensearch/config/opensearch.yml
- ./backups:/mount/backups

Cluster Migration

The presented option implies manual migration of settings and data for each node.

Two options are available:

  • Stop the cluster, perform the migration for each node and start the whole cluster (cluster restart update);

  • Migrate each node separately; in this case, the search engine remains active during the update (rolling update).
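Before starting either variant, it is useful to record a baseline of the cluster state so you can confirm afterwards that all nodes and indices came back. A sketch, assuming the default port 9200:

```shell
# Baseline checks: overall health, node list, and index overview.
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_cat/nodes?v'
curl -XGET 'localhost:9200/_cat/indices?v'
```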

  1. Install OpenSearch according to the installation instructions, making sure the Elasticsearch directories are not overwritten.

  2. Disable shard allocation:

    curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
    {
      "persistent": {
        "cluster.routing.allocation.enable": "primaries"
      }
    }
    '
    
  3. Stop Elasticsearch on one node (rolling update) or all nodes (cluster restart update):

    sudo systemctl stop elasticsearch.service
    
  4. Perform a migration on one node (rolling) or all nodes (cluster restart):

    • Copy the Elasticsearch data and logs (the contents of the data and logs directories) to the corresponding OpenSearch directories. For example: sudo cp -r /usr/share/elasticsearch/data/. /usr/share/opensearch/data/ && sudo chown -R opensearch:opensearch /usr/share/opensearch/data

    • Port the settings from elasticsearch.yml to opensearch.yml.

  5. Run OpenSearch on one node (rolling) or on all nodes (cluster restart):

    sudo systemctl start opensearch
    
  6. Rolling update: repeat steps 3 to 5 until the whole cluster has moved to OpenSearch.
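After each node is started, you can confirm that it has rejoined the cluster before moving on to the next one. A sketch; while allocation is limited to primaries, a yellow cluster status is expected:

```shell
# The restarted node should appear in the node list.
curl -XGET 'localhost:9200/_cat/nodes?v'

# Wait for the cluster to stabilize (yellow is normal until allocation is re-enabled).
curl -XGET 'localhost:9200/_cluster/health?wait_for_status=yellow&timeout=60s&pretty'
```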

  7. Turn shard allocation back on:

    curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
    {
      "persistent": {
        "cluster.routing.allocation.enable": "all"
      }
    }
    '
    

Step 4 (the migration) can be performed automatically via the opensearch-update utility:

  1. Install OpenSearch according to the installation instructions.

  2. Disable shard allocation:

    curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
    {
      "persistent": {
        "cluster.routing.allocation.enable": "primaries"
      }
    }
    '
    
  3. Make sure environment variables are set: ES_HOME, ES_PATH_CONF, OPENSEARCH_HOME, OPENSEARCH_PATH_CONF.
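For example, for a package installation the variables might be set as follows. The paths are assumptions based on common default layouts; adjust them to your installation:

```shell
# Hypothetical default paths for package installations; adjust as needed.
export ES_HOME=/usr/share/elasticsearch
export ES_PATH_CONF=/etc/elasticsearch
export OPENSEARCH_HOME=/usr/share/opensearch
export OPENSEARCH_PATH_CONF=/etc/opensearch
```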

  4. Execute the utility as the user who started Elasticsearch:

    /usr/share/opensearch/bin/opensearch-update
    
  5. Stop Elasticsearch on the node:

    sudo systemctl stop elasticsearch.service
    
  6. Start Opensearch on the node:

    sudo systemctl start opensearch
    
  7. Repeat steps 3-6 until the cluster has completely switched to OpenSearch.

  8. Turn shard allocation back on:

    curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
    {
      "persistent": {
        "cluster.routing.allocation.enable": "all"
      }
    }
    '
    

Saving Audit Data

To save audit data when migrating to OpenSearch, follow these steps:

  1. Stop the application if it is running so that no changes are made to the data.

  2. Make the default_default_audit index read-only using the following query:

    PUT http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit/_settings
    
    {
      "settings": {
        "index.blocks.write": true
      }
    }
    
  3. Clone the default_default_audit index with its data into a temporary index default_default_audit_old_index_copy using the query:

    PUT http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit/_clone/default_default_audit_old_index_copy/
    
  4. Verify that the default_default_audit_old_index_copy index has been created and contains all the required data.

  5. Delete the default_default_audit index using the query:

    DELETE http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit
    
  6. Start the server with the application and wait for startup to complete. A new default_default_audit index should be created.

  7. Copy the data from the default_default_audit_old_index_copy copy to the new default_default_audit index.

    • If the log contains a lot of data, copying may take a long time. The data is moved to the log gradually, and the system remains available during this time.

  8. Execute the query:

    POST http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/_reindex/
    
    • The body of the request must contain the following JSON:

      {
        "source": {
          "index": "default_default_audit_old_index_copy"
        },
        "dest": {
          "index": "default_default_audit"
        }
      }
      
  9. Verify that the data has been moved to the default_default_audit index.
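To verify the move, the document counts of the copy and the new index can be compared. A sketch, using the same host and port placeholders as the queries above:

```shell
# The two counts should match once the reindex has completed.
curl -XGET 'http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit_old_index_copy/_count'
curl -XGET 'http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit/_count'
```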

  10. Go to "Audit Logs" and check whether the log contains records from before the update.

  11. Optional: if the previous steps completed correctly, you can delete the default_default_audit_old_index_copy index copy using the following query:

    DELETE http://OPEN_SEARCH_HOST:OPEN_SEARCH_PORT/default_default_audit_old_index_copy