Ingest Freeze for Islandora 7
The migration is underway, and the tasks completed so far are listed below. We will continue to share updates via Slack as milestones are achieved, and a summary of each update will be added to this page.
(January 3, 2022 to May 26, 2023) ✓
Review existing CTDA content, map fields to Modern Islandora, review/address any issues with mapping fields.
(October 1, 2022 to May 31, 2023) ✓
Create new Modern Islandora servers on UConn infrastructure.
(May 1, 2023 to June 1, 2023)
Utilize small batches of CTDA content to test migration workflow, make adjustments, and prepare for larger migration.
(May 31, 2023 to Present)
Migration Processes Underway include:
- Groups being populated with media
- Configuring Groups
(July 14, 2023 to Present)
Review the migration of content, verify settings, and configure facets and display settings. Add logos and other information for groups.
Release the site to CTDA members and, after final review, launch it to the public. The release will include multiple phases to expedite the roll-out of the new system.
- Implementation of Google Analytics code
- Group configuration (logo, colors, adding members) for Group Admins
- Adding new items via +Add Content (Single Item Ingest)
- Adding new content via Spreadsheet Ingest
- Editing migrated content
- OAI-PMH Harvest Configurations
The following items are on the roadmap to be addressed or developed for Modern Islandora. Timeframes for when these features will be available will be provided once the CTDA has more details post-migration.
- Alma/Primo integration of content from the CTDA
- Remove duplicate title entry from +Add Content (Single Item Ingest) form
- Defaulting group landing page to Content Info page rather than the View page
- Enhanced statistics for objects
- Manuscript Collection module
- Improved navigation for newspaper content
- 3D object viewer
August 14, 2023
As of the morning of August 14, 2023, the media migration (datastreams from the i7 system) has been completed. Solr indexing will start soon, and additional review of the migrated items is underway. Derivative generation is being deployed in the production system, and CTDA staff are migrating pages (such as documentation) from the CTDA Sandbox to the production system.
August 8, 2023
As of the morning of August 8, 2023, a total of 2,273,339 of the 3.7 million objects have had all of their associated media connected to them in the new system, which means this step is approximately 61% complete. Thanks to the additional processing power temporarily granted to the CTDA by UConn ITS, this work is progressing much faster. As we near the last objects to be associated with their media, we expect the pace to slow down a bit, just as we have seen with other stages of the migration. DGI is testing ingest options, and more details will be released once the testing is complete.
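The percentage quoted in this update can be reproduced with simple arithmetic. The snippet below is only an illustrative sketch: it assumes the "3.7 million" total is exactly 3,700,000 (the real count is slightly higher), and the function name is hypothetical.

```python
def percent_complete(done: int, total: int) -> float:
    """Return the share of objects processed so far, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of objects")
    return 100.0 * done / total


# Figures from the August 8, 2023 update.
objects_with_media = 2_273_339
total_objects = 3_700_000  # assumption: "3.7 million" treated as an exact count

print(f"{percent_complete(objects_with_media, total_objects):.1f}% complete")
```

Because the true object total is a little above 3.7 million, the actual percentage would be slightly lower than this estimate.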
August 4, 2023
The association of media with each object within the repository is approximately 50% complete, and additional processors were added this past week to further expedite this process. DGI is preparing to conduct ingest tests on the new system to begin the load balancing process and to prepare for ingests while the next steps of the migration progress in parallel. A review of groups in the new system has identified some materials to be reviewed post-migration to ensure they are associated with their correct groups. Some cleanup will remain post-migration, which will be done using batch processes.
July 28, 2023
The CTDA migration is nearing its final stages. Over the past 2 weeks significant progress has been made, including the migration of all of the associations between namespaces and groups in the new system. The media (datastreams from the prior system) are now being populated and associated with each object in the new CTDA.
The total number of media is ~40 million, and the system's memory and processors have been doubled to expedite this work. While these media are being associated with each of the objects in the new system, the actual content is not moving; only the links are being updated. CTDA staff are actively preparing the group pages for each organization, including preparing logos and colors and identifying the groupID for each organization. The groupID is the number each organization will have within the SFTP environment to prepare content for upload to the new system via spreadsheet ingest. CTDA staff are also exploring ways to customize pages in the new system, and this work will be ongoing as we enhance pages to make it easier for content curators to find content, generate statistics, and use other time-saving functionality.
July 14, 2023
Group migration continues, with a few issues to be resolved because multiple groups were created for the same organization. A review of the affected objects is underway to associate them with one group, and as the causes of these issues are resolved, the scripts resume populating the groups with content. The CTDA Staging Box is beginning to show migrated content, but the Solr index has not been run yet, so previewing is limited to exploring specific nodeIDs.
July 7, 2023
The creation of the 3.7 million+ nodes (nodeIDs) in Modern Islandora is complete. The population of groups in the system is now underway, which involves connecting the media to the groups, connecting collections to the groups, and preparing a shell for each group. Each shell will be customized with your organization's branding once the migration is complete.
June 29, 2023
Migration of the FOXML has been completed! The containerization of content is underway, with an unexpected delay of approximately 24 hours caused by the unplanned UConn Storrs power and network outage on June 25, 2023. The containerization process enables the existing content from the CTDA to be mapped to the new nodes. It involves updating the pointers in the system to each object, without the need to "move" the objects from one system to another via the network or by copying. This process uses Kubernetes and will enable greater scale for the repository in the future. The creation of the 3.7 million+ nodes (nodeIDs) in Modern Islandora is underway and is anticipated to be complete by the end of the day on June 29, 2023 or in the morning on June 30, 2023. Once the nodes are created, the next step is to begin configuring the groups, adding user accounts, and adding logos and other branding for the groups.
June 16, 2023
FOXML migration is underway and has reached the 2 million mark on the way to 3.7 million. To increase the speed of the migration, DGI has rewritten some of the processing tools. Additional processors (20+) have been added to the CTDA staging server to further reduce the time needed for the system to complete the steps of the migration. At the CTDA Open Meeting 2023, DGI provided an update on the migration process and all the work underway and already completed to ensure a smooth migration. The recording of this session is available as part of the CTDA Report and Update session on the agenda page for the CTDA Open Meeting 2023.
June 9, 2023
With the CTDA being the largest migration project to Modern Islandora, there has been a focus on improving the processing speed for the migration, which benefits not only the CTDA but also the broader community migrating to Modern Islandora. Migration of content to the new CTDA Staging site is underway, with the FOXML being migrated first. The objects will follow in staging before they are mapped to the production server. Because the objects are not "moving" (only the database index is being moved), the objects will remain in their current storage arrays and only the connection between the object and the database is changing. Once this process is complete, the next phase is to kick off the Solr index, which will re-index all 3.7 million+ objects in the repository.
May 31, 2023
On Wednesday May 31, 2023, the next phase of the migration began with the original content being mapped to the new infrastructure. This process involves reindexing and remapping content from the existing CTDA infrastructure (i7) to the new series of Modern Islandora servers which have been allocated additional processors and resources to expedite the indexing process for the 3.7 million objects. This process will likely take a few weeks to complete and once complete, the next phase will be to review everything and prepare the system for production. This is a huge milestone that has been months (and years) in the making and we are very appreciative of everyone's work and patience as we move forward into the next generation of the CTDA with Modern Islandora.
May 26, 2023
As of Friday May 26, 2023, the migration process continues with additional testing, production system configuration, and general troubleshooting. Included below are items which are either in progress or completed as of May 26, 2023:
- Kubernetes for load balancing of the production system is being configured with documentation being developed to provide to UITS for implementation on future projects.
- CTDA Storage for the production system mapping is being planned and tested along with an allocation of 30TB of additional storage to accommodate growth in FY24.
- Multiple staging servers have been configured, additional processors added, all to facilitate the migration of the 3.7 million objects.
- Planning for future growth and expansion of the system was discussed, and the infrastructure is capable of scaling to well beyond a petabyte of data.
- Mirador viewer is in development and is on target to be implemented in the final production system.
- Group theming has been temporarily disabled to enable faster loading for test data. This functionality will return after these tests are completed.
- Some configuration and performance tests may be conducted with the sandbox so users may notice changes with fields and other configurations temporarily as the system is being tested with migration content.
May 19, 2023
As of Friday May 19, 2023, the migration process is in the metadata review, testing, and system configuration stage of the process. Included below are the items which have been completed from May 1, 2023 to May 19, 2023:
- Review of namespaces and mapping any objects not associated with an official namespace to the correct namespace - 57 objects (completed)
- Review of datastreams which are not official datastreams and mapping to official datastreams - 44 datastreams with 739 objects (completed)
- Review of objects with more than one content model and map to correct content model - 81 objects with 173 total content models (completed)
- RELS typo/error review - 13 objects (completed)
- Empty content model review - 62 objects (completed)
- Not assigned content model review - 28 objects (completed)
- Embargo list for migration - 11,510 objects (completed)
- Scholar embargo list for migration - 411 objects (completed)
- Review of creative commons licenses to ensure valid terminology is utilized in the records to be migrated (completed)
- Configuration of on premises system (underway)
- Load balancing options for on premises system (information gathering phase)
- Storage space mapping (information gathering and testing phase)
- Adding more system resources to DEV box for migration (underway)
- IP embargo configuration and testing (completed)
- Configuration of Sandbox synced to Production and frozen (completed)
May 1, 2023
The Ingest Freeze has begun. No new content may be added to the CTDA repository until the migration is complete in 6-8 weeks. The last elements of the migration process are being tested and test migration batches will begin shortly. The public CTDA server will continue to operate as usual until the migration is complete.
April 27, 2023
After more than a year and a half of preparation, the time has finally come to begin migrating content from the Islandora 7 system to the new Modern Islandora system. To prepare for this migration, an ingest freeze will need to take place starting on May 1, 2023. The migration will take multiple weeks to complete as content is mapped to the new system, indexed, validated, and reviewed.
The Ingest Freeze will likely take place for 6-8 weeks. The CTDA will provide regular updates on this process including any changes to this time period.
The status of the Ingest Freeze will be updated regularly via Slack and via banners on the CTDA manage and the new CTDA sandbox sites. These updates will include both text and a visual identifier (with alt text) to indicate the status of the Ingest Freeze.
Ingest Freeze Status
As the ingest freeze progresses, the CTDA logo will begin to be covered in ice. From May 1, 2023 until the ingest freeze is complete, the CTDA logo will be frozen in an ice cube. As the freeze begins to come to an end, the CTDA logo will begin to thaw as the ice melts. Once the freeze is complete, the CTDA logo will return with a puddle of water under it.
FAQs about the Ingest Freeze
Below are additional details on the ingest freeze process for contributing institutions.
- When does the Ingest Freeze start?
The Ingest Freeze will start on Monday May 1, 2023.
- What does an Ingest Freeze mean?
During the Ingest Freeze no new content can be added to the Connecticut Digital Archive (CTDA) manage site (manage.ctdigitalarchive.org) and no updates/edits can be made to content already ingested.
- Why do we have to have an Ingest Freeze?
The Ingest Freeze allows a snapshot of all data in the Connecticut Digital Archive to be taken and enables all of the 3.5+ million objects in the CTDA to be mapped and indexed in the new system. This freeze ensures the integrity of the data is preserved and enables a faster migration process.
- What do I need to do to help with the Ingest Freeze?
Up until Monday May 1, 2023 you can continue to edit and ingest new content via the CTDA manage site. Other than being prepared to not ingest content as of May 1, 2023 there is no additional effort needed from CTDA contributing institutions.
- How long will the Ingest Freeze take?
The ingest freeze will likely take place for 6-8 weeks. The CTDA will provide regular updates on this process including any changes to this time period.
- Can the public discover and access content within the CTDA during the Ingest Freeze?
Yes! The content in the CTDA will be discoverable and accessible during the Ingest Freeze and migration process through the public site https://ctdigitalarchive.org.
- How will updates on the Ingest Freeze be provided to the CTDA contributing institutions?
We will provide updates to the CTDA community via Slack and include a banner on both the CTDA manage and the new CTDA Sandbox sites showing the progress of the CTDA Ingest Freeze and migration process.
- Can I use the new CTDA Sandbox site during the Ingest Freeze?
Yes! The CTDA Modern Islandora Sandbox site will remain available for the CTDA contributing institutions to practice ingests, updates, and explore features that will be part of the new CTDA once the migration is complete. Keep in mind the sandbox is a place to practice and none of the content within the sandbox will be migrated to the new system.
- What can I do during the Ingest Freeze to prepare for the new system?
The Ingest Freeze is a great time to familiarize yourself with the new CTDA system, review training materials, review documentation, and prepare for ingests in the new system. The CTDA Content Manager Help Center in the sandbox is being continually updated with new information, documentation, and training materials on working with content in the new CTDA.
- Can I log in to manage.ctdigitalarchive.org during the Ingest Freeze?
No, during the Ingest Freeze manage.ctdigitalarchive.org will not be accessible for logging in and will likely show a time-out or error when you go to this page. This is to ensure the data remains unchanged during the migration process.
- Can I access content via manage.ctdigitalarchive.org during the Ingest Freeze?
No, during the Ingest Freeze manage.ctdigitalarchive.org will be offline to ensure the integrity of the data. If you need to download an object from the repository during the freeze, CTDA staff can assist. Message us on Slack or email us at email@example.com.
- I need a JPEG/JPG version of an image. How can I get a better quality image during the Ingest Freeze?
From the production site https://ctdigitalarchive.org, navigate to the object you need to download. To get the JPG, add /datastream/MEDIUM_SIZE to the end of the object's URL. If you need the higher resolution version, add /datastream/OBJ instead to download that object. These links are only to be used during this migration phase and are not persistent.
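For anyone who needs several of these links, the URL pattern described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example object URL (including its identifier) are hypothetical placeholders, and the sketch assumes the object URL copied from the browser has no query string or fragment.

```python
def datastream_url(object_url: str, datastream: str = "MEDIUM_SIZE") -> str:
    """Append a datastream suffix to the URL of an object page.

    Only the two datastreams named in this FAQ are accepted:
    MEDIUM_SIZE (the JPG) and OBJ (the higher resolution version).
    """
    allowed = {"MEDIUM_SIZE", "OBJ"}
    if datastream not in allowed:
        raise ValueError(f"datastream must be one of {sorted(allowed)}")
    # Drop any trailing slash so the result never contains "//datastream".
    return f"{object_url.rstrip('/')}/datastream/{datastream}"


# Hypothetical object URL; replace it with the URL shown in your browser.
example_object = "https://ctdigitalarchive.org/islandora/object/example:1234"

print(datastream_url(example_object))         # link to the JPG
print(datastream_url(example_object, "OBJ"))  # link to the higher resolution file
```

As noted above, links built this way are temporary and should not be saved or shared as permanent references.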
We thank the entire CTDA community for their patience and support during this Ingest Freeze and migration process.