
msod-community-page.png

If you are looking for details on the OnDemand offering, be sure to read through the Services Overview and Security white papers just released.  Links are available on the following landing page:

 

EMC Managed Services OnDemand

 

 

You'll find detailed descriptions of the catalog of applications, patch management schedule, backup and disaster recovery procedures, service level objectives, and the "secure to the core" governance and security model.

Identity Management for On-Premise Applications


Our industry today has proven technologies for providing a single set of login credentials to applications installed on-premise.  Most commonly, companies use a central Identity Management system (e.g. Microsoft Active Directory, Oracle Internet Directory, IBM Tivoli), and these systems expose an LDAP interface that third-party applications can call to validate user credentials.
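
As a rough illustration of the validation step described above, the sketch below uses the JDK's JNDI API to attempt a simple LDAP bind.  The directory URL and DN format are hypothetical placeholders and would differ for Active Directory, Oracle Internet Directory, or Tivoli.

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;

public class LdapCredentialCheck {

    /**
     * Attempts a simple LDAP bind with the supplied credentials.
     * Returns true if the directory accepts the bind, false otherwise.
     * The provider URL and DN below are illustrative placeholders.
     */
    public static boolean isValidUser(String userDn, String password) {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://idm.example.com:389");   // hypothetical directory host
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, userDn);                   // e.g. "cn=johnsmith,ou=users,dc=example,dc=com"
        env.put(Context.SECURITY_CREDENTIALS, password);
        try {
            // A successful bind means the directory validated the credentials
            new InitialDirContext(env).close();
            return true;
        } catch (NamingException e) {
            return false;
        }
    }
}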


idp_small.png

This allows end users to log in to their internal HR portal, SharePoint site, or local Documentum Webtop with the same credentials they used to log in to their Windows desktop, and is termed SSO (Single Sign-On).  This has dramatically improved the end user experience, as well as IT's ability to manage the risk and policies surrounding identity management.

 

New hires, promotions, and role changes mean that the central identity system is continually updated.  Transparent to end users, third-party applications query the Identity Management system periodically to find new users, change authorization based on group membership, and deactivate users when they no longer require access.

 

Federated Identity Management for Cloud

 

Fast forward a few years, and applications that at one time seemed anchored to the local datacenter because of data security concerns are now being moved to the cloud at a rapid pace.  IT groups are being forced to consider how to provide Identity Management not just to a few established SaaS applications, but to an ever-growing number of critical external business applications.

 

handshake_small2.gif

What companies need is a way to establish a trust relationship and contract between themselves (the Identity Provider) and external applications/services (the Service Provider).  This is where the concept of Federated Identity Management finds its purpose.  The Service Provider (EMC OnDemand) is told that it can absolutely trust a certain host on the Identity Provider side (e.g. your Active Directory Federation Services host) to vouch for a user's validity.  So when the IdP says the user is "johnsmith", the Service Provider takes that as truth and allows this user into the web application as "johnsmith", no questions asked.

 

That's a lot of trust.  How is that established?

 

The first thing to understand is that FIM (Federated Identity Management) is more than just a technical exercise.  As mentioned above, trust is a key component on both sides, and the traditional vetting done in any business relationship is still required: industry reputation, age of business, phone meetings, email validation, and so on.  Additionally, certifications such as SSAE SOC and PCI can help organizations determine a Service Provider's compliance level with respect to data policy, privacy, and auditing.

 

On the technical side, FIM requires the exchange of certificates so that a Circle of Trust is established and the IdP and SP can securely exchange messages and validate identity.  There are several standards for federated logon, but SAML (Security Assertion Markup Language) is one of the most common in the enterprise.  This standard defines how an end user is directed through the process of providing credentials, and then redirected back to the target application.
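
To give a feel for what the Service Provider consumes, here is a minimal, illustrative sketch that pulls the subject NameID out of a SAML 2.0 assertion using only JDK XML parsing.  This is not the OnDemand implementation: a real Service Provider must also verify the XML signature against the IdP certificate and check the audience and validity conditions before trusting the value.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SamlSubjectReader {

    private static final String SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion";

    /**
     * Extracts the subject NameID (e.g. "johnsmith") from a SAML 2.0 assertion.
     * Signature, audience, and validity-window checks are deliberately omitted
     * here; a real Service Provider must perform them before trusting the value.
     */
    public static String readNameId(String assertionXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(assertionXml)));
        NodeList nameIds = doc.getElementsByTagNameNS(SAML_NS, "NameID");
        return nameIds.getLength() > 0 ? nameIds.item(0).getTextContent() : null;
    }
}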

 

EMC OnDemand Implements Federated Identity and SSO

 

EMC Documentum products like Webtop have always offered Single Sign-On capability for local on-premise installs.  The EMC OnDemand team wanted to raise the bar and deliver not only the basic SSO experience, but silent SSO, where end users do not ever have to enter credentials - they simply go to the application URL, and are sent straight to the main application screen.

 

Using the SAML 2.0 Web Browser SSO Profile, along with an Identity Provider configured to use Kerberos, silent SSO is now an option for EMC OnDemand users.  Kerberos is used solely on the IdP side for authentication; the IdP then sends a SAML assertion to the Service Provider, where it is consumed and validated.  Using this flow, the OnDemand team has implemented silent SSO for the following applications:

 

  • Webtop 6.7 SPx
  • EPFM 1.x
  • xCP 2.x
  • D2 4.x

OnDemand-SAML-ProcessFlow-v0.2-ECN.png

Note that authentication using LDAP or SiteMinder over our encrypted site-to-site VPN connection is still a perfectly legitimate option for customers wishing to leverage their current infrastructure.  However, protocols like Kerberos and NTLM will not work across the VPN because the connection crosses domain boundaries.

 

 

User Provisioning and Synchronization

 

Once you solve federated login, it's tempting to stop and call it a day, but there is still the issue of user provisioning, roles, and synchronization.  Employees will change roles, leave the company, get married and change their name, or move offices, and all of these changes can affect their access in Documentum applications.  The SAML assertion only carries critical metadata and is not enough to keep the dm_user objects valid.

 

There are two ways to do this.  The first is to simply use the out-of-the-box LDAP Synchronization job that Documentum provides for traditional local deployments.  By querying your LDAP-enabled Identity Server, it can automatically create new users, mark users inactive, rename users, and change their groups.  This requires that you allow LDAP communication over the VPN connection between OnDemand and the on-premise datacenter.  This channel is very secure, and there is no access from your OnDemand vCube to the public internet.

 

The second way is to capture every critical event in your Identity Management system (new user, updated user, moved user, etc.) and push it to the repository via DQL (update user or alter group statements issued through DFC/DFS).  This would be too labor intensive and error prone to do manually, so you would most likely write a small utility to capture these Identity Server events and push the changes to the repository.  But this approach does not make much sense when the out-of-the-box LDAP Synch from Documentum already has more than a decade of development and troubleshooting behind it.
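
For illustration only, here is a hedged sketch of the kind of DQL such a utility might issue through the DFC.  The group and user names are hypothetical, the exact DQL syntax may need adjusting for your Content Server version, and the out-of-the-box LDAP Synchronization job remains the recommended approach.

import com.documentum.fc.client.DfQuery;
import com.documentum.fc.client.IDfCollection;
import com.documentum.fc.client.IDfQuery;
import com.documentum.fc.client.IDfSession;

public class IdentityEventPusher {

    /** Adds a user to a group in response to an identity event (illustrative DQL only). */
    public static void addUserToGroup(IDfSession session, String group, String user)
            throws Exception {
        runDql(session, "ALTER GROUP " + group + " ADD " + user);
    }

    /** Marks a departed user inactive (user_state = 1 marks a dm_user inactive). */
    public static void deactivateUser(IDfSession session, String user) throws Exception {
        runDql(session, "UPDATE dm_user OBJECTS SET user_state = 1 "
                + "WHERE user_name = '" + user + "'");
    }

    private static void runDql(IDfSession session, String dql) throws Exception {
        IDfQuery query = new DfQuery();
        query.setDQL(dql);
        IDfCollection results = query.execute(session, IDfQuery.DF_QUERY);
        if (results != null) {
            results.close();
        }
    }
}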

 

There are emerging standards, such as SPML and the even more promising SCIM, that may lead to a non-proprietary synchronization interface, but adoption has not yet taken hold and the other big providers like Salesforce, Google, and WebEx still rely on proprietary solutions.  So we strongly suggest LDAP Synch at the moment, with a long-term view toward the emerging industry standards.

rick-devenuti.png

EMC OnDemand was a clear force at EMC World 2013.  IIG has a solid cloud and managed service strategy, and it is always great to see these elements presented to public audiences as a principal facet of Rohit Gai's and Rick Devenuti's keynotes and backstage interviews.

 

I think what I found personally most encouraging is that our customers instantly get it.  The benefits of allowing EMC OnDemand to design, manage, monitor, and patch your IIG stack are recognized by everyone who currently holds these responsibilities.  The promise of letting the customer focus on their core business processes and data, while OnDemand manages the infrastructure using economies of scale, just makes sense.

 

I spent most of my time in the Solutions Pavilion manning the Genius Lab, and two of the most interesting things I heard across dozens of conversations were, first of all, that customers are in no way territorial about managing their IIG stack.  In fact, customers welcomed the fact that they would be freed from the plumbing and could instead use their time to support their end users and their business goals.

 

The other interesting thing I heard from our North American customers is that our on-premise solution (where the OnDemand product sits in the customer datacenter, but is managed remotely by the EMC OnDemand Support team), is a very attractive starting point given their organization's comfort level with cloud.  But whether on-premise or deployed in one of our global datacenters, the OnDemand management functionality remains the same.

 

Finally, for all of our current customers who have decades of data, policies, and workflows and questioned how that could be moved to the OnDemand platform, the newly released EMC Migration Appliance (EMA) was the right solution at the right time.  It can perform live and incremental migration at the database level, retaining object IDs and running workflows until a scheduled cutover time, so it fits the OnDemand migration scenario perfectly; in fact, we have already used it to bring customers onto our platform.

 

My EMC World/Momentum technical presentation, "EMC OnDemand: Enterprise Class Cloud", was recorded and is available on the emcworld site; just click on the media link.  There are also several other OnDemand presentations, including OnDemand ROI and customer success stories.

The concept of custom methods that run directly on the Java Method Server has proven to be an extremely useful extension point for Documentum developers and solution architects.  Whether used in a workflow activity to integrate with an enterprise message queue, or as an action for Webtop users who need temporarily escalated privileges to apply legal retention, custom Java methods have become a key customization in most customer environments. Features include:

 

  • Lightweight invocation compared to dmbasic and external Java methods, which spawn a new process for each execution
  • DFC operations execute on the same host as the Content Server, which minimizes the effects of network latency and limited throughput
  • Can be configured to run as the repository owner, which gives them elevated privileges to content when necessary
  • Provide the logic for workflow auto-activities, and are able to utilize any Java library including the DFC
  • Provide the logic for custom jobs/methods, again able to utilize the full power of Java and its libraries
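
As a small illustration of the last two bullets, the sketch below shows the core of a typical workflow auto-activity method: it looks up the workitem id handed over by the workflow engine, does its work, and completes the workitem.  The entry-point interface your class implements, and the way arguments arrive, depend on the Content Server version and deployment model, so the workitem id is shown here as a plain parameter.

import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfWorkitem;
import com.documentum.fc.common.DfId;
import com.documentum.fc.common.DfLogger;

public class AutoActivityExample {

    /**
     * Core logic of a workflow auto-activity: fetch the workitem handed to the
     * method by the workflow engine, do the real work, then complete it so the
     * workflow can advance.  The workitem id is normally passed to the method
     * as an argument (e.g. -workitem_id).
     */
    public void processWorkitem(IDfSession session, String workitemId) throws Exception {
        IDfWorkitem workitem = (IDfWorkitem) session.getObject(new DfId(workitemId));

        // Acquire the dormant workitem before working on it
        workitem.acquire();

        // ... real business logic goes here (queue integration, retention, etc.) ...
        DfLogger.warn(this, "Processing workitem " + workitemId, null, null);

        // Mark the activity as finished so the workflow proceeds
        workitem.complete();
    }
}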

 

The Legacy Deployment Process

 

In older versions of Documentum, methods needed to be archived in a jar file and copied into the Java Method Server's dba/java_methods directory.  In 6.6, the Documentum Administrator User Guide stated that this directory was deprecated and that the DmMethods.war/WEB-INF/lib directory should be used instead.

 

Regardless of the exact location of your custom jar and any supporting third-party libraries, they still needed to be physically copied to the Content Server.  And if you were updating the code, a full restart of the Java Method Server was necessary because the classes would be cached in a classloader.

 

The New Deployment Process

 

From 6.7 onward, the recommended deployment model for custom methods allows them to be hot-deployed via Composer - in other words, there is no need to physically copy the jar or its dependent libraries into any special directories on the Content Server. They are added as objects into the repository and automatically retrieved from there.

 

When deployed as a BOF module, these methods can be deployed and updated at will without incurring any downtime, because restarts of the Java Method Server are not necessary.  Each module has its own classloader, which is what makes hot updates possible.

 

Best Practice for OnDemand

 

For OnDemand it becomes even more important to follow this newer model of BOF deployment for custom methods.  In an OnDemand environment, developers have limited control of the systems, and this includes changes made to the Content Server and service restarts (which rules out copying files to dba/java_methods and restarting the JMS at will).  These limits are in place so that availability is not compromised and strict change control is maintained across the DEV/TEST/PROD environments.

 

If a developer needed a jar placed into dba/java_methods, that would have to be approved and then executed by an OnDemand Support resource at a scheduled time.  This process is not ideal for a developer who is used to a quick code/test/debug cycle.  Adhering to the platform best practice of using BOF modules gives control of the development cycle back to the developer.

 

In addition to the development cycle benefits, there are also clear advantages for the release cycle.  A system that needs three jars manually copied to specific directories on the Content Server, followed by a JMS or Content Server restart, is going to need a maintenance window and will take a hit on system availability.  A BOF module can be deployed into a production environment without any downtime or loss of availability to end users.

 

Implementing Custom Methods as BOF Modules


To jumpstart your conversion efforts, I have built a Composer 6.7 project, 'MethodsAsBOFModules', with a simple method and a workflow method that follow the newer deployment model. Here are the general steps we will follow:

 

  1. Load the 'MethodsAsBOFModules' project into Composer 6.7
  2. Create .jar archives for each method
  3. Define Composer artifacts for each method: Jar Definition, Module object, Method object
  4. Install the project into a repository
  5. Invoke the method using DQL
  6. Validate by examining the logs

 

Here is a description of the key source files in the project:

 

  • TestModule.java - custom method that establishes a session to the repository and outputs the dm_server_config.object_name
  • TestWorkflowModule.java - custom method invoked from a workflow; outputs dm_server_config.object_name and then completes the workitem
  • MethodHelper.java - contains helper methods for session creation and logging, used by both methods (a sketch of this helper logic follows below)
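
The attached project contains the actual sources; as a rough sketch of the helper logic they describe (assuming a session-manager based connection with the repository name and credentials supplied as arguments), something along these lines establishes a session, reads dm_server_config.object_name, and logs it with DfLogger:

import com.documentum.com.DfClientX;
import com.documentum.com.IDfClientX;
import com.documentum.fc.client.IDfClient;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSessionManager;
import com.documentum.fc.common.DfLogger;
import com.documentum.fc.common.IDfLoginInfo;

public class MethodHelperSketch {

    /** Builds a session manager for the given repository and credentials. */
    public static IDfSessionManager getSessionManager(String repository, String user,
            String password) throws Exception {
        IDfClientX clientX = new DfClientX();
        IDfClient client = clientX.getLocalClient();
        IDfSessionManager sessionManager = client.newSessionManager();
        IDfLoginInfo login = clientX.getLoginInfo();
        login.setUser(user);
        login.setPassword(password);
        sessionManager.setIdentity(repository, login);
        return sessionManager;
    }

    /** Logs dm_server_config.object_name, roughly what TestModule is described as doing. */
    public static void logServerConfigName(IDfSessionManager sessionManager, String repository)
            throws Exception {
        IDfSession session = sessionManager.getSession(repository);
        try {
            String name = session.getServerConfig().getString("object_name");
            DfLogger.warn(MethodHelperSketch.class,
                    "dm_server_config.object_name = " + name, null, null);
        } finally {
            sessionManager.release(session);
        }
    }
}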

 


LOAD PROJECT INTO COMPOSER

 

  1. Unzip the MethodsAsBOFModules.zip into a local directory
  2. Open Composer
  3. File > Import > Documentum > Existing Projects into Workspace
  4. Enter the full path of the unzipped directory (e.g. c:\temp\MethodsAsBOFModules), press "Browse"
  5. "Finish"

 

JARS CREATED FOR EACH METHOD

 

Each of the classes needs its own jar so that Composer can upload it into the repository.  Composer's AntBuilder automatically generates the following jars from the source files:

 

  • TestModule.java -> dist/TestModule.jar
  • TestWorkflowModule.java -> dist/TestWorkflowModule.jar
  • MethodHelper.java -> dist/MethodHelper.jar

 

DEFINE COMPOSER ARTIFACTS FOR EACH METHOD

 

The basic sequence of steps from inside Composer is:

  1. Create the Jar Definition Artifacts
  2. Create the Module Artifacts
  3. Create the Method Artifacts

 

The reader can refer to this project as a template - but essentially each jar gets its own Jar Definition, then a standard module is created that uses the jar definition as an implementation jar. Finally, the method artifact is created that refers to the module name. 

 

Jar Definition artifact

newjardef.png

Module artifact

newmodule.png

Method artifact

newmethod.png


INSTALL INTO REPOSITORY

 

  1. Right-click on project name in left hand tree view, 'Install Documentum Project'
  2. Select repository name
  3. username/password
  4. Press "Login"
  5. Use project and Artifact Settings
  6. Press 'Finish'


INVOKE METHOD


Execute the following DQL using Documentum Administrator, substituting your own local docbase name and install owner:

 

dql> EXECUTE do_method WITH method = 'TestModuleMethod', arguments = '-user_name <installowner> -docbase_name <docbase> -myparam thisisatest'

 

VALIDATE


To confirm the invocation of each method, check the following files:

 

  • docbase log on the Content Server to see the method launch trace
  • JMS ServerApps.log for DfLogger.WARN output
  • JMS server.log for stdout

 

serverapps-warn.png

TESTING THE WORKFLOW METHOD

 

To validate the 'TestWorkflowModuleMethod', create a very simple workflow with an auto-activity and assign this method as the action.  Again, you will see the launch trace in the docbase log and the output of DfLogger.WARN go to the ServerApps.log.  If the Process Engine is installed, look at its log instead for the DfLogger output.

 

 

UPDATING THE JARS AFTER SOURCE CHANGES

 

Modify the source file, which will trigger a rebuild of the jars.  Then press 'Remove' on the Jar Definition and re-add the jar.  Finally, install to the repository again with overwrite enabled.

 

jardefinition-patch.png

 


NOTES ON DEPENDENT JARS AND CLASSLOADING

 

There are two different ways that we could have provided the modules access to the shared MethodHelper.jar classes:

 

  1. We chose to add MethodHelper.jar directly to each module, in essence creating a sandboxed version for each module.  This keeps the module and the exact version of the dependency isolated to that module's classloader.
  2. The other way would have been to add MethodHelper.jar as a 'Java Library' artifact in Composer, which would make it a shared global BOF library.  Then under the 'Deployment' tab in each module definition, it could have been added as a Java Library.  This would mean that the JMS would load the classes into a shared classloader only a single time.

 

Also note that these modules will not have visibility into classes in the dba/java_methods directory, so be sure to include any dependencies as either 'Core jars' or a 'Java Library'.

 

And if a BOF module calls a TBO/SBO, add this TBO/SBO to the 'Required Modules' section of the module definition.

 

ADDITIONAL REFERENCES

 

https://community.emc.com/docs/DOC-4360 - BOF classloaders, antBuilder, ClassDefNotFoundException, ClassNotFoundException, example SBO

http://donr7n.wordpress.com/2008/10/20/methods-as-bof-modules/ - summary of IDfMethod use and hot deployment

http://donr7n.wordpress.com/category/bof/ - describes why shared Java Libraries cannot be reloaded

http://donr7n.wordpress.com/2009/01/20/jar-defs-and-java-libraries/ - discussion of shared libraries and jar definitions




Content delivery is one of the primary use cases for a Content Management system.  When users are spread across six different continents, you must have an implementation that ensures timely access for all users - not just those on the local network.  A typical scenario involves the database and primary Content Server deployed in the main North American or European datacenter, with remote user groups scattered throughout the world.  These remote offices often have limited network throughput, which makes it even more challenging.

 

Enter Branch Office Caching Services

 

Documentum has dealt with this scenario since its inception and has a myriad of options for streamlining delivery to users in geographically distributed locations or different departments, among them: remote content servers with distributed storage areas, federations with replication, and Branch Office Caching Services (BOCS).  When we, as OnDemand Architects, looked at our customer needs and use cases, it became apparent that BOCS would be instrumental in providing remote users the experience they expected - which essentially boils down to application and content access on par with a local deployment.

 

Working with our customers in the real world, we have seen that web application access for remote users (whether via Webtop, D2, or xCP 2.0) is not significantly impaired by the incremental increase in latency to return HTML/JS/CSS.  The primary factor in application response and users' perception of performance is the time it takes to transfer content during import, export, and checkin/checkout operations.

 

BOCS is the perfect fit to address this remote content transfer bottleneck.  To make this concrete, consider the illustration below.  Instead of an end user in Argentina needing to upload their 10MB Microsoft Word document to the primary OnDemand datacenter in North America, the content is transparently uploaded to their locally installed BOCS server.  To the end user, the import operation finishes almost instantly - leaving the BOCS server to later asynchronously upload the 10MB file to the primary store.  If another team member, also in Argentina, requires this content, it is already available and is simply served from the local BOCS server cache, again offering a very fast response to the end user.  In anticipation of remote use, content can even be pre-cached from the primary filestore.

 

bocs-architecture.png

 

OnDemand with the Distributed Content Feature Set

 

In a customer-managed Documentum installation, there is a good amount of design, planning, and work required to get the benefits described above.  The architecture must be thoroughly understood, a DMS server needs to be installed, ACS needs to be configured properly, and client applications such as D2 or Webtop need configuration changes.  The good news with OnDemand is that a customer only needs to do the following:

 

  1. Be an OnDemand customer using the Documentum Core stack
  2. Request that the 'Distributed Content' feature set be added
  3. Install a lightweight BOCS server following our step-by-step installation guide

 

The only real work that needs to be done by the customer is installing the BOCS servers into any remote office that needs accelerated content features.  And we make that as simple as possible by providing exact step-by-step instructions in the attached document, the OnDemand BOCS Customer Installation Guide.

 

We require customers to follow this guide exactly to ensure immediate integration with the OnDemand environment, which includes using pull mode and following naming conventions that guarantee compatibility across all the OnDemand certified client applications.

As you can imagine, potential customers have a lot of very legitimate questions when considering the move to EMC OnDemand.  For both new customers as well as those who are migrating their existing content into the EMC secure private cloud, one of the questions we hear a lot is, "Why would I choose EMC OnDemand instead of Amazon EC2?"

 

I love this question.  It gives us a chance to talk about all the EMC OnDemand value-add without the appearance of grandstanding.  And in the end, it is clear to everyone that this is an apples-to-oranges comparison, but the explanation allows us to highlight some key points that resonate very deeply with an EMC customer evaluating cloud offerings.

 

Cloud Service Models

 

cloud-service-models.png

This diagram illustrates the various flavors of cloud service models.  At the left sits an on-premise installation of your IIG software.  You control everything from top to bottom: from power to networking, OS patching, databases, IIG product installation and upgrades, backup and disaster recovery planning, high availability, and client access.  As you well know, it takes a lot of specialized skills coming together to run this smoothly.

 

The next column is IaaS (Infrastructure-as-a-Service), which is where EC2 resides.  Instead of requiring rack space in your own datacenter, which is traditionally very strictly controlled by your internal IT groups, you can pay for virtual computing in one of Amazon's data centers.  They handle the power, rack space, core networking, perhaps the basic operating system and its patching, and the availability of your server.

 

There are many reasons IaaS can be attractive.  It offers elastic computing power for custom applications, initial capital expense can be lower, it can often give application groups more control than their internal IT group is willing to offer, it comes with well-defined SLA contracts, and it can provide datacenters in multiple geographies.

 

At the other end of the spectrum is SaaS (Software-as-a-Service).  In this model, made hugely popular by Salesforce.com, companies outsource an entire business need (e.g. CRM) to a vendor who handles all facets of running the application - availability, backup, security, networking, analytics, etc. - for a usage fee.  Functions that are not core or a competitive advantage can be offloaded, and the long-term maintenance costs of an application can be avoided.  However, there is limited capability for customization and control.

 

OnDemand straddles the gap between PaaS (Platform-as-a-Service) and SaaS.  It is typically not a pure SaaS play, although D2 or solution customers such as EPFM may fall under this category.  In our experience, enterprise customers usually require a level of customization and control, and there are still desktop client apps that usually need to be exposed.  And because the management, operations, EMC Support teams, and select products are tuned specifically to a stack built by our Engineering group, we refer to it as a "purpose-built" PaaS.

 

At this point in the conversation with customers, it becomes clear that simply replacing the on-premise hardware/OS supplied by their IT department with a cloud provider's hardware/OS does not address the problems they are looking to solve.  It does not alleviate the risk, offset responsibility, or ease the internal maintenance/operations effort.

 

And for those customers who understand the cloud service model but were really asking the question "Why wouldn't I buy computing resources at Amazon and simply install the software stack myself?",  again it comes back to analyzing the root cause of why you are investigating cloud solutions in the first place.  Taking on more departmental work by building your own IIG software solution stack on cloud hardware (to enterprise specifications) means taking on risk and responsibility that doesn't match your business goals or budget.

 

Back to the Big Picture

 

So let's go back and discuss why customers are prompted to evaluate moving IIG products out of on-premise datacenters and into the cloud in the first place.  I'll speak to the challenges with Documentum since I am closest to that product, but similar stories could be told for Captiva and xPression.

 

For those who have been in the trenches of support and application development for Documentum, it is well understood that it takes a lot of people to keep the doors open.  I don't mind saying this because it is common for any Enterprise product: SAP, Oracle, Informatica, SharePoint, etc.  There are so many layers of integration and infrastructure it takes multiple teams collaborating together to run a successful solution.

 

It is not uncommon to call a meeting with a customer to address a specific set of concerns and have 8-12 people sitting around a single table: a DBA to offer insight into underlying index performance, the storage hardware specialist to address SAN driver details and latency, an IT operations resource who can answer why the OS was patched to a specific level over the weekend, two Documentum architects to explain the docbroker and ACS configuration, and several Java/WDK developers to talk to the exact application logic that is showing problems.  Add the Project Manager and you can see how a room can fill up quickly.

 

Additionally, maintenance becomes an internal chore that requires a great deal of time and expertise.  You must have someone working on short-cycle projects to patch the system, and then very large projects centered solely on upgrade paths, which can take months, in order to avoid problems with availability and data loss.

 

All this requires a great deal of expertise and effort.  You must trust your internal infrastructure teams to provide the best backup solution, disaster recovery plan, storage performance, database tuning, and network configuration to serve Documentum's needs.  Then you must rely on your group of Documentum SMEs to design the correct docbroker balancing, Content Server instance sizing, patch planning, large upgrade planning, and custom application development.

 

OnDemand allows Customers to focus on their ECM Requirements

 

Now contrast that to the OnDemand purpose-built PaaS, with EMC transparently providing not only the underlying IaaS but also:

 

  • An appropriately sized and scalable Content Server instance(s) that is monitored for availability and performance
  • A tuned database that is continually improved based on feedback from global customer field experiences
  • Dynamic storage capacity based on EMC storage hardware
  • A flexibly scheduled patch and upgrade release cycle, planned and executed by EMC OnDemand resources
  • Purpose built modules that add search, rendition, annotation, and retention services
  • Integration modules for Webtop, D2, Outlook, Mobile, and SharePoint
  • Vertical solutions modules such as EPFM can be simply bolted on to your environment
  • Backups, High-Availability, Disaster Recovery are all developed and executed by OnDemand resources
  • Controlled DEV, TEST, and PROD environments
  • Provisioning and validating the entire system in days, not months

 

Freed from the infrastructure and maintenance tasks above, our customers are able to actively engage with their environment:

 

  • Defining the custom object model, methods, and TBO
  • Designing the global folder and security model for users, groups, ACLs
  • Creating workflows for business process optimization
  • Defining retention policies
  • Developing Webtop customization
  • Creating D2 configurations
  • Configuring xCP 2.0 applications
  • Monitoring custom jobs
  • Evaluating EMC modules requested by end users
  • Validating the environments upgraded by EMC
  • Parsing application logs for any errors that may be rooted in customizations
  • Working with EMC OnDemand/Support on product issues
  • Providing first level support for their internal end users

 

 

The reason why EMC OnDemand is so attractive is that it takes on the big underlying horizontal concerns that all customers face and puts the responsibility for managing that back in the hands of specialists within EMC: backup, disaster recovery, availability, storage, scalability, performance, upgrades, and patches are all tuned specifically to the EMC application stack.  And it is continually improved based on issues and feedback from a global customer base.

 

This allows our customers to invest more time in providing valuable business tools that enable their core business strategy.

With the recent acquisition of Syncplicity by EMC, cloud-based file management for enterprise end users has gotten a lot more exciting.  As we explore the possibilities of this new collaboration tool, I wanted to illustrate how Syncplicity could even serve as the common storage for source control management.

 

Similar to MS Source Safe, CVS, and Subversion, Git is a tool for managing programming source code and resources.  It uses a decentralized model where each client machine has a copy of the entire repository tree.  In this article's deployment model, each remote end user has a working directory and pushes/pulls to a Git repository on the local Syncplicity folder share - where ultimately it is shared with all other remote users.

 

Note that the storage model described in this article is not appropriate for large development teams; it is aimed at the sole developer who needs to keep code in sync between work and home, or perhaps the project manager who needs the code branch on their iPad for code reviews or presentations.  Syncplicity (as with other cloud-based sync solutions) does not immediately synchronize changed files, and in that small window of time a commit of the same file can cause a failure.  See the end of the article for a quick fix in case you see this issue.

 

Step 1 - Install Syncplicity on two distinct hosts

 

a. Go to Syncplicity.com and sign up for a free account

b. Install the Syncplicity client on host A

c. Install the Syncplicity client on host B

 

Step 2 - Verify Syncplicity will synchronize files between hosts

 

Share a folder on host A

a. Host A - Create a local folder named c:\temp\syncplicitystore

b. Host A - Create a file named c:\temp\syncplicitystore\fromA.txt on host A which contains the text "hello!"

c. Host A - Click on "Manage and Share folders">"Add a new folder" and select c:\temp\syncplicitystore, press "OK"

 

Accept the shared folder on host B

a. Host B - a popup should appear saying that a new synchronized folder is available.  In the folder location text field, type c:\temp\syncplicitystore

b. Host B - press "Accept Folder".  The new folder will open in Windows Explorer

c. Host B - wait for synchronization to complete, you should see fromA.txt; open it using Notepad and add the text "this is from B"; File>Save

 

Verify that changes synchronize between hosts

  Host A - wait for synchronization to complete, open fromA.txt and verify that it contains the text "this is from B"

 

Step 3 - Install the Git client on two distinct hosts

 

Install the command line Git client on both hosts

a. Go to the Git download page and download the latest Windows client (non-GUI)

b. Run the installer exe, accepting all defaults

c. Add "c:\Program Files\Git\bin" to the System PATH

 

Setup an identity on host A

a. Open the command line

b. git config --global user.email hosta

c. git config --global user.name hosta

 

Setup an identity on host B

a. Open the command line

b. git config --global user.email hostb

c. git config --global user.name hostb

 

 

Step 4 - Create a shared "bare" git repository using host A

 

Use a directory from local filesystem to seed a normal git repository

a. create c:\temp\proj1 and navigate into it

b. echo this is the readme > README.txt

c. git init

d. git add README.txt

e. git commit -m "Initial commit"

 

Create bare git repository on shared folder

a. create c:\temp\syncplicitystore\proj1 and navigate into it

b. git init --bare

 

Push files to shared 'bare' git repository

a. cd c:\temp\proj1

b. git remote add origin file://c:/temp/syncplicitystore/proj1

c. git push origin master

 

Delete initial seed directory

Now that the central Git repository has been created on the Syncplicity share, the c:\temp\proj1 directory can be deleted.  All work will be done from the remote working directories created below in steps 5 and 6.

 

Step 5 - create remote working directory for host B

 

a. make the directory c:\temp\working and navigate into it

b. git clone "file://c:/temp/syncplicitystore/proj1"

c. cd proj1

d. verify that README.txt says 'this is the readme'

 

Step 6 - create remote working directory for host A

 

a. make the directory c:\temp\working and navigate into it

b. git clone "file://c:/temp/syncplicitystore/proj1"

c. cd proj1

d. verify that README.txt says 'this is the readme'

 

Step 7 - verify that a conflicting edit can be resolved

 

Make a modification on both hosts

a. Host A - modify README.txt, add last line that says "edit from A"

 

this is the readme

edit from A

 

b. Host B - modify README.txt, add last line that says "edit from B"

 

this is the readme

edit from B

 

 

Push a commit from host A

a. git add README.txt

b. git commit -m "commit from a"

c. git push

 

From host B, pull the latest commit, get a conflict and resolve it

a. wait for Syncplicity synchronization to complete

b. git pull (should report failed merge)

c. open README.txt and modify so line #2 is 'edit from A', and line #3 is 'edit from B' as shown below

 

this is the readme

edit from A

edit from B

 

d. git add README.txt

e. git commit -m "fixed merge"

f. git push

 

Pull the commit from host A, see the file merge successfully

a. wait for Syncplicity synchronization to complete

b. git pull (should report a successful merge, inserting lines #2 and #3); README.txt should look like the text below

 

this is the readme

edit from A

edit from B

 

SUCCESS

 

While this solution may not be appropriate for an active multi-user team, it can still be of great utility for a developer who needs to keep their work and home projects synchronized, or for anyone who needs a mostly read-only version of the code line.

 

 

 

Author Note

If you need a "central" Git repository for a production multi-developer project, set up a Git server or use a shared network location.  Similarly, if you don't need branching/merging or other source control functionality, you could have Syncplicity mirror the Git working directory (instead of the Git database itself).

 

 

How to Fix Issue with Simultaneous Edits from two Different Clients

As mentioned in the opening section, the Git repository can be put into an invalid state if two different remote clients commit the same file at the same time.  If you see "fatal: Reference has invalid format" or "conflicting version" messages upon push or pull and cannot rectify the situation using the git client commands, you can search the Syncplicity folder for file names that contain "conflicting" and delete them.  This will recover your Git repository to a working state.

 

 

References:

http://en.wikipedia.org/wiki/Git_(software)

http://tumblr.intranation.com/post/766290743/using-dropbox-git-repository

http://www.gitguys.com/topics/shared-repositories-should-be-bare-repositories/

http://www.gitguys.com/topics/creating-a-shared-repository-users-sharing-the-repository/

http://kahthong.com/2012/05/how-use-google-drive-or-dropbox-host-your-private-git-repositories

http://push.cx/2011/dropbox-and-git

http://stackoverflow.com/questions/1960799/using-gitdropbox-together-effectively

The Documentum Foundation Services (DFS) introduced developers to the 'DFS Data Model', a rich object model that is capable of representing complex repository objects and relationships during interactions with content services.  For those with a DFC programming background, it can be a challenge to shift into the DFS paradigm, which focuses on service-oriented calls and relies on the data model to fully describe the requested transformations.

 

Based on my contact with customers through formal Service Requests as well as the EMC Support Forums, I see that many architects, when presented with this unfamiliar landscape, instantly assume that the best course of action is to design a custom model to shield other developers from the perceived complexity of the DFS data model.  Although well intentioned, I believe this initial reaction to change can have serious implications that are not often considered or understood at the time of implementation.

 

While abstracting the construction of the DFS data model carries a great deal of value, replacing the DFS data model with a custom model should be done only with deliberate purpose and awareness.  I will use this article to explore the motivations behind the development of these "simplified" models, their ramifications for a long-term SOA strategy, and how you can deliver convenience without making integration unnecessarily difficult or hindering the building-block nature of SOA.

 

The Initial Reaction

 

One of the first things noticed by DFC programmers is the amount of setup required with the DFS data model.  Let's go through one quick example to make this point concrete.  Here is an example of using the DFC to link a pre-existing object to the 'Temp' cabinet:

// identify object
IDfSysObject sysObject = (IDfSysObject) session.getObjectByPath("/dmadmin/test.doc");

// create link relationship
sysObject.link("/Temp");

// update object
sysObject.save();

 

Here is the equivalent using the DFS SDK:

// identify object
DataObject dataObject = new DataObject(new ObjectIdentity(new ObjectPath("/dmadmin/test.doc"), repository), "dm_document");

// create link relationship
ObjectIdentity folderTarget = new ObjectIdentity(new ObjectPath("/Temp"), repository);
ReferenceRelationship referenceRelationship = new ReferenceRelationship();
referenceRelationship.setName(Relationship.RELATIONSHIP_FOLDER);
referenceRelationship.setTarget(folderTarget);
referenceRelationship.setTargetRole(Relationship.ROLE_PARENT);
referenceRelationship.setIntentModifier(RelationshipIntentModifier.ADD);
dataObject.getRelationships().add(referenceRelationship);

// update object
objectService.update(new DataPackage(dataObject), new OperationOptions());

 

There are many examples of DFC "one-liner" operations such as IDfSysObject.link that require more statements when expressed in the DFS data model, as illustrated above.   The initial exposure to DFS programming often leads architects to the premature conclusion that the DFS data model is too difficult to work with directly because of the verbosity.

 

Based on these initial assumptions about the complexity of the DFS Data Model, it is understandable why an intermediate layer would be considered.  However, I would assert that the DFS data model is quite simple to understand, and it is not the use of the model that adds complexity, but the construction of the model that one should seek to simplify.

 

Design Options

 

We will first consider the consequences of creating a custom data model (Option A), then contrast that with preserving the DFS data model but offering convenience methods or Builders to simplify the construction of the model (Option B).

 

Option A. Creating a Custom 'Simplified' Data Model

 

One way to approach the problem is to create a new data model to represent objects and transformations.  In this solution, users of the new model are presented with a simplified set of objects, which only expose the immediate and short-term needs of the consumers.

 

As an example, let's imagine we created a framework where a developer could use a simple type called MySimpleDFSObject to assign the identity and relationships of a document in a convenient manner.  We would also want to insulate users from having to call the DFS service methods directly, which require extra parameters (e.g. OperationOptions), so we have a MyServicesFactory that instantiates streamlined services that internally provide defaults for the most common options and methods.

 

// identity object
 MySimpleDFSObject myObj = new MySimpleDFSObject("/dmadmin/test.doc","dm_document");

 // create link relationship
 myObj.link("/Temp");

 // update object
 MyObjectService myService = MyServicesFactory.getMyObjectService(mySession);
 myService.doSimpleSave(myObj);

 

The example framework above may seem like an ideal solution, but while it is successful at keeping DFC developers in their comfort zone and shielding them from SOA concepts, it also has serious consequences that need to be weighed.

 

PROS

 

  • Intuitive to the domain - because of the concise and simplified nature of these custom objects and their methods, the functionality has been tuned to the exact methods that are going to be used by the initial developer population (setting attributes, saving, etc.), and therefore the user's conceptual model matches the API.

CONS

 

  • Simplicity is fleeting - version 1.0 of your custom API could be the most intuitive and minimalist object model ever created, but in two months your end users will be asking about BOCS content transfer options, then permission sets, then advanced structured queries, and so on.  It is only a matter of time before your model needs to represent the equivalent functionality of the DFS data model.

  • Build versus Buy - as mentioned in the item above, users will demand more functionality, options, and services.  Each enhancement to your API will need to be coded, comprehensively tested, and deployed.  This requires significant effort and therefore cost, and it is smarter to shift these costs to EMC, which has gone to great lengths to maintain a stable API that is continually patched based on real-world use.

  • Lock-in to custom layer - since the custom model is fixed, any significant features or new services added by EMC to the platform will not be available to users until the custom layer is enhanced.  For example, in the upgrade to Documentum D6.5, seven new services were added to the platform. Using a custom layer, these would have all been unavailable until the custom data model was enhanced.

  • SOA Integration potentially more difficult - the DFS object model is uniform across the entire spectrum of content services delivered on the platform, including the Object Service, Search Service, Workflow Service, CTS Transformation Service, RPM services, etc.  This uniformity supports the orchestration goals and service composition advocated in an SOA architecture.  DFS object types can be easily passed in to custom DFS services, and then immediately used as parameters to other DFS services.  Custom types force service writers to constantly marshal data types into those understood by the target service.

 

Option B. Developing Methods that Simplify Manipulation of the DFS Data Model

 

The other approach to simplify DFS development is to expose users directly to the data model, but have utility methods or builders that take on the brunt of the work.  For illustration, let's imagine a simple utility class, DFSHelper, that has static methods that assist in the construction of the DFS data model for common tasks:

 

 

// identify object
 DataObject dataObject = DFSHelper.constructDataObject("/dmadmin/test.doc","dm_document");

 // create link relationship (originally took 7 lines to represent)
 DFSHelper.link(dataObject,repository,"/Temp");

 // update object
 IObjectService objService = DFSHelper.getObjectService();
 objService.update(new DataPackage(dataObject), DFSHelper.getDefaultOperationOptions());

 

 

What may not be obvious to readers not looking at the DFS JavaDocs is that DataObject, DataPackage, and IObjectService are part of the DFS SDK.  And the DFSHelper class does not preclude the developer from using the DFS data model directly instead of the convenience methods it provides.
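
For completeness, here is one possible implementation of the hypothetical DFSHelper.link convenience method.  It simply wraps the same relationship construction shown in the earlier DFS SDK example, so callers continue to work with the standard DataObject type:

// Hypothetical implementation of DFSHelper.link - it wraps the relationship
// construction from the earlier snippet, using the same DFS SDK classes.
public static void link(DataObject dataObject, String repository, String folderPath) {
    ObjectIdentity folderTarget = new ObjectIdentity(new ObjectPath(folderPath), repository);
    ReferenceRelationship referenceRelationship = new ReferenceRelationship();
    referenceRelationship.setName(Relationship.RELATIONSHIP_FOLDER);
    referenceRelationship.setTarget(folderTarget);
    referenceRelationship.setTargetRole(Relationship.ROLE_PARENT);
    referenceRelationship.setIntentModifier(RelationshipIntentModifier.ADD);
    dataObject.getRelationships().add(referenceRelationship);
}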

 

 

PROS

 

  • Simplifies exactly where needed - the DFS data model is full-featured, documented well, and is relatively simple.  Providing a utility class or builder to bundle common sequences of operations is a natural extension.

  • Stable data model - the DFS data model contains all the objects and options available to the core DFS services as well as the product-specific services, and is mature enough to have been through several releases already.  End user requests for enhanced functionality are likely to already be satisfied.

  • Take immediate advantage of new services - there are many core DFS services as well as those tied specifically to products like CTS, Records Manager, CenterStage, etc.  Each new release of Documentum or a product will bring new services that can instantly be leveraged using the DFS common data model.

  • Encourages 'building block' SOA - the nirvana of an SOA architecture is the ability to take disparate services scattered throughout an organization and orchestrate/aggregate these building blocks into a valuable business service.  Using a common object model is key to this initiative because without it, integration of several services is an exercise in writing adaptors and transformations to satisfy the input parameters and output results of each service.

CONS

 

  • General data model - a general model such as the DFS data model must fulfill broad requirements, whereas a custom model can be tailored exactly to the end-user solution.

Summary

 

In this article we have gone over two different approaches to simplifying DFS development with respect to the data model.  There are no absolutes in design, but I hope I have presented a strong argument for direct use of the DFS data model, and that the facts presented here will allow you to make an informed decision based on the long-term implications.