#1013 Removing config files from Mercurial

tactics Sun 7 Mar 2010

While making some patches last week, I ran into problems with the configuration files in Fantom being stored under Mercurial. I wanted to share my thoughts on removing these files from the Mercurial repository.

The problem is that configuration files are meant to be changed by each user, but those changes are not meant to be propagated to version control. Unfortunately, whenever a change is made, Mercurial picks up the change automatically. This makes it difficult to create patches because these files must be carefully handled to ensure they are not included in the diff. (I ended up having to remove the same 15 lines from my diffs a dozen times and one change STILL got through!)

I asked about the standard solution to this "configuration file versioning" issue on StackOverflow. The response I got was that the best practice was to keep configuration files out of version control and find a way to generate them after hg pull.

My idea is to remove etc/* from our Mercurial repository and then have adm/bootstrap.fan create it when a repository is first cloned.

There are a few different ways the second step could be implemented. The easiest would be to version a directory called etc.defaults/ containing all the default config files. Then adm/bootstrap.fan would simply need to rename this directory to etc/ and everything would work. It would also be the easiest solution for users who don't use the bootstrap script.

(Of course, for releases, etc.defaults/ would be replaced by etc/ so users would be able to run Fantom with zero setup).

The other idea I had was a bit more involved, but could be useful when the pod ecosystem starts to grow. Essentially, each pod would contain its own default config file. Or maybe each pod would be able to automatically configure itself. It's something to put on my Fantom wishlist.

Let me know what you guys think :)

brian Sun 7 Mar 2010

It is nice to have a "default" configuration file you can just tweak when you install the software.

But for when you are working directly off tip, this was one problem the PathEnv was designed to solve. If you set up your working directory to something other than tip, then the working directory configs override the tip configs. This is probably the recommended approach (but doesn't work for bootstrap).

Also, not sure you are aware of this, but you can override etc/build/config.props with environment variables - see build::BuildScript.config. This is what I use myself for alternate bootstrap setups.

Edit: setting up bootstrap for Windows still needs some work; I've been thinking of switching Windows to use the Unix model with FAN_SUBSTITUTE.

katox Sun 7 Mar 2010

Generally it is safer to provide a template and a check of users' configuration, I agree. Most of the time it would be sufficient to override default config in Env repository but unfortunately that is not possible for bootstrap.

I work around that problem by making a separate branch which I rebase on Mercurial tip. My real working branches start from this config branch.

tactics Sun 7 Mar 2010

I agree that we need a default configuration. But there should be a distinction between the default configuration and actual configuration. Right now, the two are mixed together.

At the heart of this issue, versioning your actual config files is a mistake. Changes I make to my own setup should not be noticed by Mercurial. But since we have them in the repository, they are.

This isn't a huge issue right now, but as Fantom grows a larger developer base, we want to make it as easy as possible to get started. Right now, the config files create an unnecessary, annoying barrier for working out of a repository.

Instead, by renaming etc/ to etc.defaults/ in the repository, a developer only has to copy a single directory before hacking. They can configure their environment as much as they want, and Mercurial doesn't bug them about it. This copying step can even be done automatically by adm/bootstrap.fan.
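A minimal sketch of that copying step, written in Python rather than Fantom purely for illustration. The function name ensure_etc is hypothetical; the sketch assumes the repository root holds a versioned etc.defaults/ directory and that an existing etc/ should never be overwritten.

```python
import shutil
from pathlib import Path


def ensure_etc(repo_root: Path) -> bool:
    """Create etc/ from etc.defaults/ if etc/ does not exist yet.

    Returns True when a copy was made, False when etc/ was already
    present (so a developer's local settings are left alone).
    """
    defaults = repo_root / "etc.defaults"   # versioned defaults (assumed layout)
    working = repo_root / "etc"             # unversioned working config
    if working.exists():
        return False
    if not defaults.is_dir():
        raise FileNotFoundError(f"missing defaults directory: {defaults}")
    shutil.copytree(defaults, working)
    return True
```

Because the function refuses to touch an existing etc/, it would be safe to call on every bootstrap run, not just the first one after a clone.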

For release builds, this step wouldn't be necessary, of course. A regular user downloading Fantom would just see the etc/ directory containing all the defaults.

Using PathEnv works, but it isn't the optimal solution. It requires setting an environment variable, which can get messy. Environment variables either need to be set every time you open a new shell or they need to be changed every time you decide to work on a different repository.

Using PathEnv also requires using the Fan env mechanism. In my particular case, I was trying to write a new Env, so I couldn't make use of both PathEnv and my new Env at the same time. Also, as Katox mentioned, using PathEnv for bootstrapping is pretty darned complicated.

Brian and Andy, I know you guys have your workflow set up in a way which suits you, but this issue will mostly affect outside developers, so please give some consideration to that. Personally, this one small change (renaming etc/ to etc.defaults/ in the repo) would have saved me an hour. As more developers join the project, this change could save a lot of time and frustration for many others.

brian Mon 8 Mar 2010

Personally, this one small change (renaming etc/ to etc.defaults/ in the repo

If I understand your proposal, it is that /etc is put into .hgignore and taken out of version control. Then every time we need to make a change to a file in /etc we need to remember to manually copy it to /etc.defaults? I don't think that is really a viable solution.

I think using environment variables is the simplest, cleanest solution. We use that technique to solve this problem for both the Fantom and the SkyFoundry codebase. I use it myself for the build config on my MacBook. The only thing we are really talking about here is the location of jdkHome, and I just stick it in my .profile file.

Although if we can come up with something better, I'm all ears. But I don't want a solution that subverts version control.

andy Mon 8 Mar 2010

The two major issues I see here are:

etc/build/build.props    // locations of JDK and .NET
lib/sys.props            // Windows fan subs

Today both enforce a directory structure. So to work off tip you must modify both these files and always manually deal with them for changes. If we move to the unix model for fan substitutes, and env var overrides work for build.props I think everything will be rosy - but need to do both.

brian Mon 8 Mar 2010

So just to rephrase what Andy said:

Today on Unix/OS X: you can set FAN_BUILD_JDKHOME along with FAN_SUBSTITUTE, and everything including bootstrap build should work with no config file changes.

Today on Windows: lib/sys.props must be changed for bootstrap substitutes. But the current plan Andy and I have discussed is to replace that design with the FAN_SUBSTITUTE env var design used on Unix.
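The lookup order implied by this kind of override can be sketched as follows. This is an illustration in Python, not the code in build::BuildScript.config. The rule "FAN_BUILD_ plus the upper-cased property name" is an assumption generalized from the single FAN_BUILD_JDKHOME example, and build_config is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def build_config(name: str,
                 props: Mapping[str, str],
                 env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Look up a build setting, letting an environment variable win.

    Assumed convention: property "jdkHome" is overridden by the
    environment variable FAN_BUILD_JDKHOME.
    """
    env = os.environ if env is None else env
    override = env.get("FAN_BUILD_" + name.upper())
    if override is not None:
        return override          # local setting, never written to a versioned file
    return props.get(name)       # fall back to the value from config.props
```

With a lookup like this, the versioned config file keeps its defaults and each developer's machine-specific path lives only in their shell profile.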

katox Mon 8 Mar 2010

I agree that if configuration files are under version control users (or even developers) shouldn't be forced to edit them in place unless they want to introduce new configuration options or something similar.

Some sort of install (rather than working off the src tree) could also be considered.

If env vars are to be prevalent I think we should also consider adding options to set/override them on the command line (like -Dvariable=value in java).

Having an env variable is bearable though I find the management of JAVA_HOME, ANT_HOME, M2_HOME, DEFAULT_MAVEN_OPTS, ORACLE_*, FAN_*, CATALINA_HOME and whatever _HOME a bit annoying.

tactics Mon 8 Mar 2010

If I understand your proposal, it is that /etc is put into .hgignore and taken out of version control. Then every time we need to make a change to a file in /etc we need to remember to manually copy it to /etc.defaults? I don't think that is really a viable solution.

The first notion is correct. The etc/ (the actual configuration used on the system) will be added to .hgignore and its contents will be left alone by Mercurial.

The second notion, that you must manually copy every change from etc/ to etc.defaults/ is not quite what I had in mind.

The etc.defaults/ directory will be used for default configurations only. These rarely change. Since the transition to .props files, the only changes to any files in the etc/ directory have been rolling up the buildVersion number in etc/build/config.props. The only time you would make any changes at all to etc.defaults/ is when you change the set of properties that a pod can react to.

I'll illustrate my idea in more detail, because I don't know if I've communicated it well enough.

Configurations today

Let's take etc/sql/config.props as an example:

test.connection=jdbc:mysql://localhost:3306/fantest
test.username=fantest
test.password=fantest
test.dialect=sql::MySqlDialect

Now honestly, there will be no one who actually has a MySQL database named fantest sitting around on their box. This file is only there to show you the "template" for how the sql pod's config.props should look. When you want to use the sql pod for your own database, you change the username, password, and connection string to point to your own database:

foo.connection=jdbc:mysql://localhost:3306/foo
foo.username=bar
foo.password=baz
foo.dialect=sql::MySqlDialect

But now, if we hg diff our working directory, Mercurial sees we changed our configuration:

$ hg diff
--- a/etc/sql/config.props	Tue Feb 02 10:17:07 2010 -0500
+++ b/etc/sql/config.props	Tue Feb 02 10:33:47 2010 -0500
@@ -6,7 +6,7 @@

-test.connection=jdbc:mysql://localhost:3306/fantest
-test.username=fantest
-test.password=fantest
-test.dialect=sql::MySqlDialect
+foo.connection=jdbc:mysql://localhost:3306/foo
+foo.username=bar
+foo.password=baz
+foo.dialect=sql::MySqlDialect
$ 

I see this as a problem. I have changed my config file, something which is specific to me, but Mercurial assumes I want to share these changes with the rest of the world. If I create a patch, I must be very careful to undo these changes to my repository before submitting them. Otherwise, I end up changing the default configurations for everyone else.

Moving default configurations to etc.defaults/

Now imagine that instead we rename etc/ to etc.defaults/ in the repository. What are the consequences?

Most noticeably, our repository no longer contains an etc/ directory, so nothing can run. We must first create etc/. To do that, we just copy etc.defaults/ to etc/. We can then run fan src/buildall.fan and everything will compile just fine.

The cool thing about this setup is that Mercurial now ignores my custom configuration. I can change etc/sql/config.props to use my foo database as above. But now if I run hg diff, what I see is this:

$ hg diff
$

Mercurial doesn't see the working config files. I can change them as much as I want, and Mercurial doesn't care.

Now suppose, for example, a year from now we want to add a new hypothetical config variable for the sql pod called encrypt. It's a simple boolean in the configuration file, and when it's set to true, SQL statements are encrypted before they are passed over the network.

I can change this in my etc/sql/config.props while I'm developing this new killer feature. When I'm happy with the change, I have one more step to perform. I need to make sure a default value for encrypt is set for everyone who uses it. This is the part where I do need to copy over etc/sql/config.props to etc.defaults/sql/config.props. Once I do, Mercurial will "see" the change to the default config file, and it will be included in my patch.

This changing of defaults is a rare event. It's hardly non-viable. Changing the working/actual configuration, on the other hand, is a relatively frequent event that creates headaches in Mercurial.

Misc

So that's my proposal. It's just creating the distinction between two kinds of configuration: working/actual and the defaults.

Honestly, I had forgotten about lib/sys.props when I wrote this. The fan substitutions tend to "just work". Though I do support changing them to work like how they do in unix.

brian Mon 8 Mar 2010

The problem is that etc is much bigger than just build config props. Some of these files like the unit, ext, and flux databases are pretty sophisticated. I don't like the idea of having to manually remember to sync changes.

The problem is this:

  • everything under etc is designed to be changed (that is why this stuff isn't hard coded)
  • it requires version control

I think the PathEnv is the best general purpose solution - it lets you cleanly override any file or single prop from etc without touching the core files. I don't think we want to redesign that feature.
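The override behavior described here can be sketched roughly as below. This is a Python illustration of the general idea (a working directory shadowing the core install, prop by prop), not PathEnv's actual implementation. The names load_props and resolve_props and the simplified key=value parsing are assumptions.

```python
from pathlib import Path
from typing import Dict, Iterable


def load_props(path: Path) -> Dict[str, str]:
    """Parse a simplified props file: key=value lines, // comments, blanks."""
    props: Dict[str, str] = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("//") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def resolve_props(search_path: Iterable[Path], rel: str) -> Dict[str, str]:
    """Merge rel (e.g. "etc/sql/config.props") across search_path.

    search_path is ordered highest priority first, so the working
    directory comes before the core install.
    """
    merged: Dict[str, str] = {}
    # Apply lowest priority first so higher-priority values overwrite them.
    for root in reversed(list(search_path)):
        candidate = root / rel
        if candidate.is_file():
            merged.update(load_props(candidate))
    return merged
```

Only the props that the working directory actually sets are replaced; everything else still comes from the versioned file, which is why the core files never need to be edited.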

So to me we are talking specifically about the bootstrap problem.

If you agree that we should frame this discussion as a bootstrap issue, then the only question is whether there is really any problem with using the two environment variables (assuming we fix Windows).

tactics Mon 8 Mar 2010

So to me we are talking specifically about the bootstrap problem.

I ran into this problem while trying to write my "hello world" patch of all things :P I don't see it as a bootstrap-only issue, though that is a big part of it.

If you are set against this kind of change, I guess there is no helping it. I will try the PathEnv solution instead and see how that works.

Of course, any improvements to the bootstrap process are appreciated. It's gotten pretty easy to bootstrap, but there are still a few kinks here and there.

EDIT - Come to think of it, it reared its head in "hello world" because of the bootstrap process. While I still feel it's more general than just the bootstrap, realizing this, it doesn't seem nearly as important. As long as the core runtime can be bootstrapped and run without any changes to etc/, that's good enough for me.

katox Mon 8 Mar 2010

I ran into this problem while trying to write my "hello world"

Maybe the default Env setup should use a different repo than dev or rel (like the former FAN_REPO). Everyone would have to set it up for hello worlding, thus avoiding edits to the default configuration.

Setup for bootstrap would still require environment variables, FAN_BUILD_JDKHOME, FAN_SUBSTITUTE ... and what about FAN_DEVHOME? Shouldn't I set it too to avoid duplicating all pods into the new fan repo?

andy Mon 8 Mar 2010

I'm not sure I would get too bogged down with this bootstrap case - no one (at least in their right mind) would build anything substantial against the tip of a language. You're going to pick a stable build and upgrade as you move along when it's viable.

So this is only an issue for people who want to experiment with the current language codebase. If you are not interested in modifying anything, the established convention is to use a PathEnv - which makes this a moot issue I believe.

We just need a way for those people to build the core codebase without having to muck with hg changes - and the only two things needed for that were listed above.

My two cents :)

brian Tue 9 Mar 2010

I think my goal is to separate this issue into two different problems:

  1. bootstrap
  2. general purpose solution

I think for bootstrap everyone is in agreement that having two environment variables seems like a simple and straightforward solution. I still have to fix the win32 launcher.

I think the key point for the general solution is that I believe Env is the preferred way to tackle this issue. This is exactly what Env was originally designed for (not necessarily working from hg, but not having to touch your core install). That means we need to keep in mind that any new Envs should always allow overriding etc configuration. So even if we have something like ScriptEnv, it probably still needs to chain to easily enable the override functionality of PathEnv. Something to keep in mind as we work through that design.
